Release 1.5
Commit | Description |
---|---|
Wed Jun 12 15:02:49 CEST 2013 by Michael Peter Christen | added a new 'Citations' function: each search result item can now be explored for citations within other documents. A click on the 'Citations' link shows an analysis with all text lines in the document each with a complete list of documents which contain the same line. A second section shows the linking documents in ascending order of number of citations from the original document. Because documents from different hosts are most interesting here, they are listed at the top of the page as possible 'copypasta' source. Changed Files: defaults/yacy.init, htroot/ConfigPortal.html, htroot/ConfigPortal.java, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/api/citation.html, htroot/api/citation.java, htroot/yacysearchitem.html, htroot/yacysearchitem.java, source/net/yacy/cora/document/MultiProtocolURI.java, source/net/yacy/cora/federate/solr/responsewriter/GrepHTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java |
Tue Jun 11 14:42:30 CEST 2013 by Michael Peter Christen | added a 'greedy learning' mechanism which causes a 'fresh' YaCy peer to load linked web pages from search results until the total number of web pages reaches 15000. This gives fresh peers a 'boost' so that they build a personalized search index faster. Changed Files: defaults/yacy.init, defaults/yacy.network.freeworld.unit, defaults/yacy.network.intranet.unit, defaults/yacy.network.metager.unit, defaults/yacy.network.webportal.unit, htroot/ConfigHeuristics_p.java, htroot/ConfigNetwork_p.java, htroot/yacysearch.java, htroot/yacysearchitem.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java |
Mon Jun 10 16:22:00 CEST 2013 by Michael Peter Christen | added new buttons to search result page in p2p mode which show the switch between p2p search and the 'stealth mode' which is simply a non-p2p search within the p2p network. The functionality was there all the time, but the switch to this was not very visible. Changed Files: htroot/env/base.css, htroot/env/grafics/searchmode_p2p_activated_32.png, htroot/env/grafics/searchmode_p2p_deactivated_32.png, htroot/env/grafics/searchmode_stealth_activated_32.png, htroot/env/grafics/searchmode_stealth_deactivated_32.png, htroot/index.html, htroot/index.java, htroot/yacysearch.html, htroot/yacysearch.java |
Sun Jun 09 12:12:34 CEST 2013 by orbiter | replaced yacydoc servlet usage by a solr result output using an html output writer. This made the creation of an html result writer necessary, which is included in this commit. The yacydoc servlet was used to present all metadata of a document, but the solr interface can serve this purpose much better. All usages of yacydoc (except one) were replaced by a solr call. This also affects the 'metadata' link attached to search results. Changed Files: htroot/ConfigSearchPage_p.html, htroot/IndexControlURLs_p.html, htroot/ViewFile.html, htroot/solr/select.java, htroot/yacysearchitem.html, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java |
Fri Jun 07 13:20:57 CEST 2013 by Michael Peter Christen | Added a citation reference computation for intra-domain link structures. While the values for the reference evaluation are computed, a backlink structure can also be discovered and written to the index. The host browser has been extended to show such backlinks for each presented link, so it can now show where a document is linked from. The new citation reference is computed as the likelihood of a random click path, with recursive usage of the previously computed likelihood. This process is repeated until the likelihood converges to a specific number. This number is then normalized to a ranking value CRn, 0<=CRn<=1. The value CRn can therefore be used to rank popularity within intra-domain link structures. Changed Files: defaults/solr.collection.schema, htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/ProcessType.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/workflow/WorkflowProcessor.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Thu Jun 06 22:07:54 CEST 2013 by reger | - fix stopword handling for RWI see example http://bugs.yacy.net/view.php?id=247 - append language setting specific stopword list - remove unused OVERHANG stack type Changed Files: htroot/yacysearch.java, source/net/yacy/cora/storage/HandleSet.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/NoticedURL.java, source/net/yacy/kelondro/index/RowHandleSet.java, source/net/yacy/kelondro/util/SetTools.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/query/SearchEvent.java, yacy.stopwords |
Sat Jun 01 05:43:08 CEST 2013 by reger | enable use of solrcore.properties for property substitution of solrconfig.xml - move setting of system property solr.directoryFactory=solr.MMapDirectoryFactory to solrcore.properties - add check of os.arch for 64bit system, if it fails use default/solrcore.x86.properties (if exists) as solrcore.properties reason: on 32bit MMapDirectoryFactory may fail with..... Caused by: java.io.IOException: Map failed at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:849) at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:283) Changed Files: defaults/solr/solrcore.properties, defaults/solr/solrcore.x86.properties, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/yacy.java |
Wed May 29 18:27:27 CEST 2013 by Michael Peter Christen | removed 'later' tactic because it used too much RAM, reduced number of soft commits, reduced caching size of search events, ensured that solr results are processed before connection is closed to keep that stuff not too long in RAM Changed Files: defaults/solr/solrconfig.xml, htroot/yacy/crawlReceipt.java, htroot/yacy/transferURL.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/SearchEvent.java |
Mon May 20 22:05:28 CEST 2013 by Michael Peter Christen | reduced locking situation in crawler: shifted synchronized location and reduced time-out of robots.txt load limit Changed Files: htroot/Bookmarks.java, htroot/CrawlCheck_p.java, htroot/Crawler_p.java, htroot/DictionaryLoader_p.java, htroot/Load_RSS_p.java, htroot/ViewFile.java, htroot/ViewImage.java, htroot/api/getpageinfo.java, htroot/api/getpageinfo_p.java, htroot/api/webstructure.java, htroot/yacysearch.java, htroot/yacysearchitem.java, source/net/yacy/crawler/Balancer.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/crawler/retrieval/RSSLoader.java, source/net/yacy/crawler/robots/RobotsTxt.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/data/ymark/YMarkMetadata.java, source/net/yacy/document/importer/OAIListFriendsLoader.java, source/net/yacy/document/importer/OAIPMHLoader.java, source/net/yacy/peers/graphics/OSMTile.java, source/net/yacy/peers/operation/yacyRelease.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/snippet/MediaSnippet.java, source/net/yacy/search/snippet/TextSnippet.java |
Fri May 17 13:59:37 CEST 2013 by Michael Peter Christen | redesign of index.exist-test: this is now done not with a single id to be tested, but with a collection of ids. This causes only a single call to solr instead of many. The result is much better performance when testing the existence of many urls. The effect should be much less IO during index transmission, on both the sender and the receiver side. Changed Files: htroot/HostBrowser.java, htroot/IndexControlRWIs_p.java, htroot/Load_RSS_p.java, htroot/yacy/transferRWI.java, htroot/yacy/transferURL.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/protocol/ftp/FTPClient.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/crawler/retrieval/RSSLoader.java, source/net/yacy/crawler/retrieval/SitemapImporter.java, source/net/yacy/kelondro/workflow/WorkflowProcessor.java, source/net/yacy/migration.java, source/net/yacy/peers/Transmission.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/server/serverObjects.java |
Mon May 13 13:28:07 CEST 2013 by Michael Peter Christen | refactoring of WorkflowProcessor: added a process counter, which is updated if a blocking thread dies. Also added a new column in the PerformanceConcurrency_p servlet to show the actual number of concurrent processes. Changed Files: htroot/PerformanceConcurrency_p.html, htroot/PerformanceConcurrency_p.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/kelondro/workflow/AbstractBlockingThread.java, source/net/yacy/kelondro/workflow/InstantBlockingThread.java, source/net/yacy/kelondro/workflow/WorkflowProcessor.java, source/net/yacy/peers/Dispatcher.java, source/net/yacy/search/Switchboard.java |
Thu May 09 02:17:53 CEST 2013 by Michael Peter Christen | migrated to solr 4.3.0 Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, defaults/solr/schema.xml, defaults/solr/solrconfig.xml, lib/lucene-analyzers-common-4.3.0.jar, lib/lucene-analyzers-phonetic-4.3.0.jar, lib/lucene-classification-4.3.0.jar, lib/lucene-codecs-4.3.0.jar, lib/lucene-core-4.3.0.jar, lib/lucene-facet-4.3.0.jar, lib/lucene-grouping-4.3.0.jar, lib/lucene-highlighter-4.3.0.jar, lib/lucene-join-4.3.0.jar, lib/lucene-memory-4.3.0.jar, lib/lucene-misc-4.3.0.jar, lib/lucene-queries-4.3.0.jar, lib/lucene-queryparser-4.3.0.jar, lib/lucene-spatial-4.3.0.jar, lib/lucene-suggest-4.3.0.jar, lib/noggit-0.5.jar, lib/solr-core-4.3.0.jar, lib/solr-solrj-4.3.0.License, lib/solr-solrj-4.3.0.jar |
Thu May 09 00:22:45 CEST 2013 by Michael Peter Christen | - upgraded httpclient, httpcore and httpmime - removed httpclient 3.1 which has been used by solrj < 4.x.x and is now not used any more - fixed some parts in YaCy which used methods from httpclient 3.1 Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/httpclient-4.2.5.License, lib/httpclient-4.2.5.jar, lib/httpcore-4.2.4.License, lib/httpcore-4.2.4.jar, lib/httpmime-4.2.5.License, lib/httpmime-4.2.5.jar, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/server/http/ChunkedInputStream.java |
Wed May 08 11:50:46 CEST 2013 by Michael Peter Christen | prevent that the size of the index is computed too many times. Because the index size is now provided by solr, and the only way to do that is a match for [* TO *], a size computation is quite complex and time-consuming. Therefore this patch prevents that the method is called at all and if necessary puts a DOS-preventing barrier in front of it. Changed Files: htroot/HostBrowser.java, htroot/IndexControlURLs_p.java, htroot/PerformanceGraph.java, htroot/yacy/hello.java, htroot/yacy/query.java, htroot/yacyinteractive.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java |
Mon May 06 16:45:54 CEST 2013 by Michael Peter Christen | re-declared some fields to be of type string rather than text which makes them more efficient and less large Changed Files: defaults/solr.collection.schema, htroot/Crawler_p.java, htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/SolrType.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java |
Sat Apr 27 01:32:18 CEST 2013 by Michael Peter Christen | refactoring (renaming) of yacy-solr api Changed Files: htroot/HostBrowser.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/WebgraphConfiguration.java |
Fri Apr 26 10:49:55 CEST 2013 by Michael Peter Christen | - added a new field for the regular expression in crawl start - added the field in crawl profile - adapted logging and error management - adapted duplicate document detection - added a new rule to the indexing process to reject non-matching content - full redesign of the expert crawl start servlet The new filter field can now be seen in /CrawlStartExpert_p.html at Section "Document Filter", subsection item "Filter on Content of Document" Changed Files: htroot/CrawlProfileEditor_p.java, htroot/CrawlStartExpert_p.html, htroot/CrawlStartExpert_p.java, htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, htroot/env/base.css, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/document/Document.java, source/net/yacy/search/Switchboard.java |
Thu Apr 25 11:33:17 CEST 2013 by orbiter | - reduction of the concurrently running processes to make YaCy more adjusted to smaller and 1-core devices. - the workflow processor now starts no process at all. these are started as soon as parser/condenser/indexing queues are filled. - better abstraction Changed Files: htroot/ViewImage.java, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/kelondro/blob/MapHeap.java, source/net/yacy/kelondro/data/word/Word.java, source/net/yacy/kelondro/data/word/WordReferenceVars.java, source/net/yacy/kelondro/index/RowHandleMap.java, source/net/yacy/kelondro/workflow/AbstractBlockingThread.java, source/net/yacy/kelondro/workflow/AbstractThread.java, source/net/yacy/kelondro/workflow/InstantBlockingThread.java, source/net/yacy/kelondro/workflow/WorkflowProcessor.java, source/net/yacy/peers/Dispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/snippet/TextSnippet.java |
Tue Apr 16 14:45:14 CEST 2013 by Michael Peter Christen | fixed ranking for add-function queries: this did not work. The option was removed. All function queries are now boosts (multiplies the score according to a function). This is also the recommended way to boost rankings based on functions as explained in http://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/ Changed Files: defaults/yacy.init, htroot/RankingSolr_p.html, htroot/RankingSolr_p.java, htroot/gsa/searchresult.java, htroot/solr/select.java, source/net/yacy/cora/federate/solr/Ranking.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java |
Mon Apr 15 14:08:30 CEST 2013 by Michael Peter Christen | redesign of exists()-query (can now be called with query) and the CachedSolrConnector which based its cache on the key value. This will be used to correct the title_unique_b and description_unique_b field. Changed Files: htroot/PerformanceMemory_p.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java |
Sat Apr 06 16:11:24 CEST 2013 by Michael Peter Christen | upgrade to solr 4.2.1 Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/lucene-analyzers-common-4.2.1.jar, lib/lucene-analyzers-phonetic-4.2.1.jar, lib/lucene-classification-4.2.1.jar, lib/lucene-core-4.2.1.jar, lib/lucene-facet-4.2.1.jar, lib/lucene-grouping-4.2.1.jar, lib/lucene-highlighter-4.2.1.jar, lib/lucene-join-4.2.1.jar, lib/lucene-memory-4.2.1.jar, lib/lucene-misc-4.2.1.jar, lib/lucene-queries-4.2.1.jar, lib/lucene-queryparser-4.2.1.jar, lib/lucene-spatial-4.2.1.jar, lib/lucene-suggest-4.2.1.jar, lib/solr-core-4.2.1.jar, lib/solr-solrj-4.2.1.jar, lib/solr.License, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/search/query/QueryParams.java |
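
The citation reference computation in the Fri Jun 07 entry above (a random click path likelihood, iterated until it converges, then normalized to CRn) can be illustrated with a small sketch. This is not YaCy's implementation: the function name `citation_rank`, the damping factor, the handling of pages without outgoing links, and the normalization by the maximum value are all assumptions made for illustration.

```python
def citation_rank(links, damping=0.85, tol=1e-9, max_iter=100):
    """Sketch of a converging click-path likelihood, normalized to [0, 1].

    links maps a page to the list of same-host pages it links to.
    All parameters and the normalization are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    if not pages:
        return {}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with a uniform likelihood
    for _ in range(max_iter):
        # every page keeps a small base likelihood (random jump)
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # a click on p's links passes p's likelihood on in equal shares
                share = damping * rank[p] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # page without outgoing links: spread its likelihood evenly
                share = damping * rank[p] / n
                for q in pages:
                    nxt[q] += share
        delta = sum(abs(nxt[p] - rank[p]) for p in pages)
        rank = nxt
        if delta < tol:  # converged
            break
    top = max(rank.values())
    # normalize so that the most-cited page gets the value 1
    return {p: r / top for p, r in rank.items()}


site = {"/": ["/a", "/b"], "/a": ["/"], "/b": ["/"]}
crn = citation_rank(site)
```

In this toy site the start page `/` is linked from both other pages, so it ends up with the highest value, while `/a` and `/b` share the same lower value.
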
Commit | Description |
---|---|
Thu Jun 13 22:42:21 CEST 2013 by Michael Peter Christen | npe fix Changed Files: source/net/yacy/kelondro/blob/Heap.java |
Thu Jun 13 18:27:57 CEST 2013 by orbiter | fix for citation search in case that the citation is very fresh Changed Files: htroot/api/citation.java |
Wed Jun 12 13:23:58 CEST 2013 by Michael Peter Christen | npe fix Changed Files: source/net/yacy/cora/sorting/OrderedScoreMap.java |
Tue Jun 11 16:22:43 CEST 2013 by Michael Peter Christen | added fixed clear method as public method Changed Files: source/net/yacy/crawler/data/NoticedURL.java |
Fri Jun 07 00:13:45 CEST 2013 by reger | add null pointer check to stopword fix Changed Files: source/net/yacy/search/query/SearchEvent.java |
Tue May 28 16:26:38 CEST 2013 by orbiter | prevent NPE in case RWI is disabled Changed Files: htroot/PerformanceQueues_p.java, htroot/yacy/query.java, htroot/yacy/search.java, htroot/yacysearch.java, source/net/yacy/peers/Dispatcher.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/snippet/ResultEntry.java |
Sun May 26 03:24:32 CEST 2013 by reger | fix DHT url receive see http://bugs.yacy.net/view.php?id=242 Changed Files: htroot/yacy/transferURL.java |
Mon May 13 13:27:01 CEST 2013 by Michael Peter Christen | fixed query expressions for collection selection (added quotes) Changed Files: source/net/yacy/search/query/QueryModifier.java |
Sun May 12 21:36:20 CEST 2013 by orbiter | fix for workflow processor (cause: latest redesign for less threads) Changed Files: source/net/yacy/kelondro/workflow/WorkflowProcessor.java |
Sat May 11 11:19:06 CEST 2013 by Michael Peter Christen | small memory leak patch Changed Files: source/net/yacy/crawler/data/Latency.java, source/net/yacy/repository/LoaderDispatcher.java |
Tue Apr 30 11:06:48 CEST 2013 by Michael Peter Christen | infinity timeout bug protection patch Changed Files: source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java, source/net/yacy/crawler/Balancer.java, source/net/yacy/kelondro/data/word/WordReferenceFactory.java, source/net/yacy/kelondro/data/word/WordReferenceVars.java, source/net/yacy/peers/Dispatcher.java, source/net/yacy/peers/graphics/WebStructureGraph.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/ranking/ReferenceOrder.java |
Sun Apr 28 20:09:45 CEST 2013 by Michael Peter Christen | fixed bad css change Changed Files: htroot/env/base.css |
Sun Apr 21 12:27:27 CEST 2013 by Michael Peter Christen | fixed default ranking values Changed Files: htroot/RankingSolr_p.java |
Sat Apr 20 10:53:49 CEST 2013 by orbiter | avoid NPE in regex checker Changed Files: source/net/yacy/repository/RegexHelper.java |
Tue Apr 16 13:32:13 CEST 2013 by Michael Peter Christen | fix for result counter logging Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java |
Tue Apr 16 12:38:16 CEST 2013 by Michael Peter Christen | fix to ranking configuration servlet Changed Files: htroot/RankingSolr_p.html |
Tue Apr 16 01:39:30 CEST 2013 by Michael Peter Christen | fixed api table navigation Changed Files: htroot/Table_API_p.java |
Sun Apr 14 02:01:27 CEST 2013 by reger | skip postprocessing during document.store if no citation index connected (prevent null pointer exception) Changed Files: source/net/yacy/search/index/Segment.java |
Sat Apr 06 02:29:49 CEST 2013 by reger | fix typo in prev commit Changed Files: htroot/AccessTracker_p.html |
Wed Mar 20 16:19:49 CET 2013 by Michael Peter Christen | fixes for better search interface integration in yaml templates Changed Files: htroot/solr/select.java, htroot/yacysearch.java, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java |
Mon Mar 18 00:10:23 CET 2013 by reger | fix invisible icon not found Changed Files: htroot/HostBrowser.html |
Commit | Description |
---|---|
Thu Jun 13 23:50:00 CEST 2013 by Michael Peter Christen | Release 1.5 Changed Files: build.properties |
Thu Jun 13 22:40:46 CEST 2013 by Michael Peter Christen | typo Changed Files: htroot/Steering.html |
Thu Jun 13 22:32:06 CEST 2013 by Michael Peter Christen | increased time-out for loading of seed-lists Changed Files: source/net/yacy/search/Switchboard.java |
Thu Jun 13 22:31:39 CEST 2013 by Michael Peter Christen | added target="_blank" to shutdown links Changed Files: htroot/Steering.html |
Thu Jun 13 14:44:47 CEST 2013 by orbiter | added a feed-back message inside the shutdown page Changed Files: htroot/Steering.html |
Thu Jun 13 13:22:43 CEST 2013 by Michael Peter Christen | show the citation report also in ViewFile Changed Files: htroot/ViewFile.html, htroot/ViewFile.java |
Thu Jun 13 13:08:24 CEST 2013 by Michael Peter Christen | fixed usage of ViewFile which needs a commit before showing latest crawl result pages. Changed Files: htroot/ViewFile.java |
Thu Jun 13 13:03:56 CEST 2013 by Michael Peter Christen | removed warning message during crawling Changed Files: source/net/yacy/crawler/CrawlStacker.java |
Thu Jun 13 13:01:28 CEST 2013 by Michael Peter Christen | removed fields references_internal_id_sxt and references_internal_url_sxt because they had been shown to be superfluous. The citation of referrer in the host browser is possible without them. Therefore now the host browser does not only show internal, but also external referrer to each link. Changed Files: defaults/solr.collection.schema, htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Wed Jun 12 11:29:35 CEST 2013 by Michael Peter Christen | switching back to the merge factor 10; the solr default. Changed Files: defaults/solr/solrconfig.xml |
Wed Jun 12 02:13:18 CEST 2013 by Michael Peter Christen | added synchronizations and timeouts in solr api; missing synchronizations in index modification methods cause deadlocks inside solr. Changed Files: defaults/yacy.init, htroot/IndexFederated_p.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java |
Wed Jun 12 00:17:44 CEST 2013 by Michael Peter Christen | calling pdf cache flush on class initialization because calling of the methods during runtime can conflict with dynamic solr class loader and cause a deadlock (seriously!) Changed Files: source/net/yacy/document/parser/pdfParser.java |
Wed Jun 12 00:16:28 CEST 2013 by Michael Peter Christen | removed misleading http accessGranted message (this is only for debugging) Changed Files: source/net/yacy/server/http/HTTPDFileHandler.java |
Wed Jun 12 00:14:55 CEST 2013 by Michael Peter Christen | reduced load on solr; no seed update in Status and no exists-check in HTTPLoader in case of redirects, that can be done using the htcache. Changed Files: htroot/Status.java, source/net/yacy/crawler/retrieval/HTTPLoader.java |
Wed Jun 12 00:12:04 CEST 2013 by Michael Peter Christen | changed administration page headline to 'Administration' Changed Files: htroot/env/templates/header.template |
Wed Jun 12 00:10:25 CEST 2013 by Michael Peter Christen | changed windows icon again Changed Files: addon/YaCy.ico, addon/YaCy_TrayIcon.png |
Tue Jun 11 16:51:40 CEST 2013 by Michael Peter Christen | increased the solr merge factor because 4 was too much IO load for frequent index receiving and re-indexing after clickdepth/cr calculation. Changed Files: defaults/solr/solrconfig.xml |
Tue Jun 11 16:50:34 CEST 2013 by Michael Peter Christen | changed p2p/stealth mode text and links a bit Changed Files: htroot/yacysearch.html |
Tue Jun 11 14:52:46 CEST 2013 by Michael Peter Christen | allip net has greedy learning disabled Changed Files: defaults/yacy.network.allip.unit |
Tue Jun 11 14:51:26 CEST 2013 by Michael Peter Christen | removed forced soft commit since this may be the cause for a performance problem Changed Files: source/net/yacy/search/index/Segment.java |
Tue Jun 11 13:16:46 CEST 2013 by Michael Peter Christen | new icons Changed Files: addon/YaCy.app/Contents/Info.plist, addon/YaCy.app/Contents/Resources/YaCy_2013_Icon.icns, addon/YaCy.ico, htroot/favicon.bmp, htroot/favicon.ico, htroot/favicon.png |
Tue Jun 11 13:12:59 CEST 2013 by Michael Peter Christen | use a greeting line which does not sound so beta Changed Files: source/net/yacy/gui/InfoPage.java |
Mon Jun 10 18:41:00 CEST 2013 by Michael Peter Christen | added another response writer which can present search result with texts, separated by sentences. Then, these sentences can be used to search again in the index for the same sentence. This can be used to provide a tool for plagiarism-search. (not finished yet). Try the following: http://localhost:8090/solr/select?q=text_t:flut&grep=wasser&defType=edismax&start=0&rows=3&core=collection1&wt=grephtml .. to search for 'flut' and show only sentences in the result documents which contain the word 'wasser'. Consider this like using a grep-tool on documents: you select the documents by a search query and you grep sentences inside the found documents with the 'grep' attribute. Changed Files: htroot/solr/select.java, source/net/yacy/cora/federate/solr/responsewriter/GrepHTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java |
Mon Jun 10 18:36:06 CEST 2013 by Michael Peter Christen | the line "Web Search by the People, for the People" is more generic for P2P and portal search as default search string. Otherwise, if people switch to Portal mode, the "P2P Web Search" does not make sense. Changed Files: defaults/yacy.init |
Mon Jun 10 16:23:58 CEST 2013 by Michael Peter Christen | fix for host compare in case that the host is null. This happens when doing a search in the intranet for file resources (they don't have a host). Changed Files: source/net/yacy/search/Switchboard.java |
Sun Jun 09 08:15:23 CEST 2013 by orbiter | show the cache link in search results only if there is actually a cache entry stored in HTCACHE Changed Files: htroot/yacysearchitem.java |
Fri Jun 07 14:26:14 CEST 2013 by Michael Peter Christen | activated citation ranking by default Changed Files: defaults/solr.collection.schema |
Fri Jun 07 13:22:22 CEST 2013 by Michael Peter Christen | usage of the new normalized link popularity CRn as default ranking function. This replaces the previous formula, which was bad. Before you update to this version, please check whether you changed the ranking function yourself, since it will be overwritten. Changed Files: defaults/yacy.init, source/net/yacy/search/SwitchboardConstants.java |
Fri Jun 07 12:52:03 CEST 2013 by Michael Peter Christen | patch in HTCache and CitationIndex loading in case that a file is broken: do not crash; instead ignore the file and delete it. Changed Files: source/net/yacy/crawler/data/Cache.java, source/net/yacy/kelondro/blob/ArrayStack.java, source/net/yacy/kelondro/io/CachedFileWriter.java, source/net/yacy/kelondro/rwi/ReferenceContainerArray.java |
Fri Jun 07 08:52:07 CEST 2013 by Michael Peter Christen | fixes to index deletion: quoting of host name (a '-' may be part of the url) and disabling the engage button when changing the url field at 'Delete by URL matching' Changed Files: htroot/IndexDeletion_p.html, htroot/IndexDeletion_p.java |
Thu Jun 06 13:36:58 CEST 2013 by orbiter | in GSA api enable usage of solr fq-attribute together with GSA site-attribute Changed Files: htroot/gsa/searchresult.java |
Sun Jun 02 13:50:12 CEST 2013 by Michael Peter Christen | fix for bad exists 'enhancement'; see bug: http://bugs.yacy.net/view.php?id=245 Changed Files: source/net/yacy/search/index/Fulltext.java |
Sat Jun 01 05:50:03 CEST 2013 by reger | fix: enable use of solrcore.properties for property substitution of solrconfig.xml Changed Files: source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java |
Thu May 30 16:39:48 CEST 2013 by Michael Peter Christen | added missing class Changed Files: source/net/yacy/search/StorageQueueEntry.java |
Thu May 30 16:30:35 CEST 2013 by Michael Peter Christen | ranking and boost function update, small bugfixes, better default search field for solr Changed Files: defaults/solr/solrconfig.xml, defaults/yacy.init, htroot/IndexControlRWIs_p.html |
Thu May 30 13:01:22 CEST 2013 by Michael Peter Christen | removed block rank ranking and all YBR files in /ranking Changed Files: build.xml, htroot/IndexControlRWIs_p.html, htroot/IndexControlRWIs_p.java, htroot/RankingRWI_p.java, htroot/index.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/ranking/BlockRank.java, source/net/yacy/search/ranking/RankingProfile.java, source/net/yacy/search/ranking/ReferenceOrder.java |
Thu May 30 12:47:22 CEST 2013 by Michael Peter Christen | cleanup Changed Files: htroot/IndexControlRWIs_p.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java |
Thu May 30 12:39:28 CEST 2013 by Michael Peter Christen | added timeout for remote searches of 10 seconds Changed Files: source/net/yacy/cora/federate/solr/instance/RemoteInstance.java |
Thu May 30 12:38:54 CEST 2013 by Michael Peter Christen | try to commit in case of failure which hopefully frees up some RAM Changed Files: source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java |
Thu May 30 12:38:15 CEST 2013 by Michael Peter Christen | Store node/solr search threads to be able to send them an interrupt signal in case that a cleanup process wants to remove the search process. Also added a new cleanup process which can reduce the number of stored searches to a specific number which can be higher or lower according to the remaining RAM. The cleanup process is called every time a search is started. Changed Files: source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java |
Thu May 30 12:35:47 CEST 2013 by Michael Peter Christen | remove text_t in search result after snippet has been computed to save space in search result cache Changed Files: source/net/yacy/search/snippet/ResultEntry.java |
Thu May 30 12:34:53 CEST 2013 by Michael Peter Christen | new workflow processor in Segment to enqueue indexing documents to solr Changed Files: source/net/yacy/kelondro/data/word/Word.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/ReindexSolrBusyThread.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java |
Thu May 30 12:31:28 CEST 2013 by Michael Peter Christen | default configuration of MMapDirectoryFactory for solr, increased lock timeout, fewer documents from remote searches (too many results had easily blocked a peer) Changed Files: defaults/solr/solrconfig.xml, defaults/yacy.init, source/net/yacy/yacy.java |
Wed May 29 16:09:05 CEST 2013 by Michael Peter Christen | getting the trash out Changed Files: source/net/yacy/document/parser/pdfParser.java, source/net/yacy/kelondro/data/word/Word.java, source/net/yacy/search/Switchboard.java |
Wed May 29 13:45:22 CEST 2013 by Michael Peter Christen | added new link for SMW Changed Files: htroot/ContentControl_p.html |
Wed May 29 13:42:38 CEST 2013 by Michael Peter Christen | removed dead link Changed Files: htroot/ContentControl_p.html |
Wed May 29 13:30:32 CEST 2013 by Michael Peter Christen | less logging Changed Files: source/net/yacy/peers/Protocol.java |
Wed May 29 13:10:32 CEST 2013 by Michael Peter Christen | added new keys for update locations Changed Files: defaults/yacy.network.allip.unit, defaults/yacy.network.freeworld.unit, defaults/yacy.network.intranet.unit, defaults/yacy.network.metager.unit, defaults/yacy.network.webportal.unit |
Wed May 29 13:09:34 CEST 2013 by Michael Peter Christen | added option to re-boot the embedded solr during run-time. Added also API recording for this method so it can be repeated automatically. The index dump generation is now also available for API recording. Added some synchronization in backend which was necessary for this. Changed Files: htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, source/net/yacy/search/index/Fulltext.java |
Wed May 29 12:02:19 CEST 2013 by Michael Peter Christen | fixed ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.util.List; in robots.txt servlet Changed Files: htroot/robots.java |
Tue May 28 11:38:45 CEST 2013 by Michael Peter Christen | use a retry handler with retryCount=0 because we usually expect requests to fail if we access non-permanently available resources (peers, web pages) and want to fail fast without repeating the same request which is doomed to fail. The previous appearance of http client connection had a 1-2-4-8-second timeout scheme, which caused that connection attempts lasted for 16 seconds. Changed Files: source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/protocol/http/HTTPClient.java |
Tue May 28 11:35:56 CEST 2013 by Michael Peter Christen | include API Table deletion requests to the API recorder Changed Files: htroot/Table_API_p.java |
Tue May 28 10:36:49 CEST 2013 by Michael Peter Christen | activating pollImmediately in case that DHT receive is off. This will cause a much faster search result when running in public robinson mode. Changed Files: source/net/yacy/search/query/SearchEvent.java |
Tue May 28 10:33:41 CEST 2013 by Michael Peter Christen | fixed missing thisaddress in yacysearch.html which caused that the opensearch link was not working Changed Files: htroot/yacysearch.java |
Mon May 27 16:15:58 CEST 2013 by Michael Peter Christen | added a (badly formatted) delete button for process scheduler entries Changed Files: htroot/Table_API_p.html, htroot/Table_API_p.java |
Mon May 27 15:23:12 CEST 2013 by orbiter | set a higher limit for table copy usage Changed Files: source/net/yacy/kelondro/table/Table.java |
Mon May 27 13:45:09 CEST 2013 by Michael Peter Christen | javadoc of new multiple-exist test Changed Files: source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java |
Sat May 25 12:56:43 CEST 2013 by Marc Nause | *) simplified banner creation code Changed Files: htroot/Banner.java, source/net/yacy/peers/graphics/Banner.java, source/net/yacy/peers/graphics/BannerData.java |
Sat May 25 11:08:06 CEST 2013 by Marc Nause | *) updated links to description of regex Changed Files: htroot/Blacklist_p.html |
Mon May 20 11:25:26 CEST 2013 by Michael Peter Christen | nice crawl name if crawl is started with file:// (was: null) Changed Files: htroot/Crawler_p.java |
Mon May 20 11:02:21 CEST 2013 by Michael Peter Christen | added the reindexing job servlet to the submenu structure Changed Files: htroot/IndexReindexMonitor_p.html, htroot/env/templates/submenuIndexControl.template |
Mon May 20 01:50:09 CEST 2013 by reger | - odt & ooxml (office document) parser correction to add content to fulltext index - adjust Junit yacyVersionTest & ParserTest - update yacyVersion.combined2prettyVersion to the default 4-digit minor ver. Changed Files: source/net/yacy/document/parser/odtParser.java, source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/peers/operation/yacyVersion.java, test/de/anomic/document/ParserTest.java, test/de/anomic/yacy/yacyVersionTest.java |
Fri May 17 14:11:10 CEST 2013 by Michael Peter Christen | - no downcase when using collection modifier - removed warnings Changed Files: source/net/yacy/crawler/retrieval/RSSLoader.java, source/net/yacy/search/query/QueryModifier.java |
Wed May 15 23:16:32 CEST 2013 by reger | more generic field selection for reindex option of documents with disabled fields using Luke request to compare config with actual fields in index Changed Files: source/net/yacy/migration.java, source/net/yacy/search/index/ReindexSolrBusyThread.java |
Wed May 15 22:42:05 CEST 2013 by Michael Peter Christen | reject bad solr requests Changed Files: htroot/solr/select.java, source/net/yacy/server/serverObjects.java |
Mon May 13 13:26:24 CEST 2013 by Michael Peter Christen | enhanced deletion process for very large number of documents Changed Files: source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java |
Mon May 13 04:06:57 CEST 2013 by reger | added reindex option for documents with disabled or obsolete fields to Solr Schema Editor page (IndexSchema_p.html). This allows removing obsolete fields from the index (according to current schema config) by selecting all documents containing disabled fields. Changed Files: htroot/IndexReIndexMonitor_p.java, htroot/IndexReindexMonitor_p.html, htroot/IndexSchema_p.html, source/net/yacy/migration.java, source/net/yacy/search/index/ReindexSolrBusyThread.java |
Sun May 12 21:37:45 CEST 2013 by orbiter | prevent that concurrent deletion process causes wrong double-check in crawl start Changed Files: source/net/yacy/search/Switchboard.java |
Sat May 11 10:53:12 CEST 2013 by Michael Peter Christen | removed synchronization and concurrency in Fulltext class, concurrent deletions are now handled in ConcurrentUpdateSolrConnector Changed Files: htroot/CrawlResults.java, htroot/Crawler_p.java, htroot/IndexControlURLs_p.java, source/net/yacy/search/index/Fulltext.java |
Fri May 10 17:33:02 CEST 2013 by Michael Peter Christen | added new peer icons for Mentor peers and Mentee peers (not used yet) Changed Files: htroot/env/grafics/JuniorMentee.gif, htroot/env/grafics/SeniorMentor.gif |
Fri May 10 17:32:21 CEST 2013 by Michael Peter Christen | - added ssl configuration sign (a lock) to network statistic/table - fixed a bug in bitfield Changed Files: htroot/Network.html, htroot/Network.java, source/net/yacy/cora/document/ASCII.java, source/net/yacy/peers/Seed.java, source/net/yacy/search/Switchboard.java, source/net/yacy/utils/bitfield.java |
Fri May 10 13:49:46 CEST 2013 by Michael Peter Christen | added checkbox (near port) to switch on ssl support (https access) to the admin interface. Changed Files: htroot/ConfigBasic.html, htroot/ConfigBasic.java |
Fri May 10 12:02:31 CEST 2013 by orbiter | Added a default keystore for ssl encryption of the YaCy web interface. This will enable https-access to YaCy, but this feature is disabled by default using the new server.https=false attribute. This has two purposes: - make it easier for everyone to use https (just set server.https=true) - provide the basis for secure yacy-to-yacy communication in the future Changed Files: defaults/freeworldKeystore, defaults/yacy.init, htroot/Status.java, source/net/yacy/server/serverCore.java |
Fri May 10 05:54:07 CEST 2013 by reger | reduce SolrConnectorLogging setting (from default ALL to INFO) Changed Files: defaults/yacy.logging |
Fri May 10 04:56:58 CEST 2013 by Michael Peter Christen | fix for sitemap detection: the sitemap url was not visible if it appeared after the declaration of robots allow/deny for the crawler because the sitemap parser terminated after the allow/deny rules had been found. Now the parser reads the robots.txt until the end to discover also sitemap rules at the end of the file. Changed Files: htroot/api/getpageinfo_p.java, source/net/yacy/crawler/robots/RobotsTxtParser.java |
Fri May 10 04:38:13 CEST 2013 by reger | - fix monitor url of crawl job in PerformanceQueues_p.html - reduce logging of every index add (switch embeddedsolr.add from info to debug) Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/search/Switchboard.java |
Thu May 09 03:06:48 CEST 2013 by Michael Peter Christen | removed some unnecessary synchronizations Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java |
Wed May 08 23:45:29 CEST 2013 by Michael Peter Christen | merged classpath Changed Files: .classpath |
Wed May 08 16:48:45 CEST 2013 by orbiter | fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4652 generate dht data even if dht receive and dht transmission is switched off Changed Files: source/net/yacy/kelondro/blob/HeapModifier.java, source/net/yacy/search/Switchboard.java |
Wed May 08 15:17:06 CEST 2013 by orbiter | updated pdf parser Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/fontbox-1.8.1.License, lib/fontbox-1.8.1.jar, lib/jempbox-1.8.1.License, lib/jempbox-1.8.1.jar, lib/pdfbox-1.8.1.License, lib/pdfbox-1.8.1.jar |
Wed May 08 13:26:25 CEST 2013 by Michael Peter Christen | fixes to deletion methods (removed unnecessary concurrency and added removal of crawl queue entries) Changed Files: htroot/Crawler_p.java, htroot/HostBrowser.java, htroot/IndexDeletion_p.java, htroot/PerformanceMemory_p.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java |
Wed May 08 12:41:24 CEST 2013 by Michael Peter Christen | better robustness of Concurrent Solr Connector against update/deletion thread failure Changed Files: source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java |
Mon May 06 14:58:18 CEST 2013 by Michael Peter Christen | increased default proxy client timeout to one minute Changed Files: defaults/yacy.init, source/net/yacy/server/http/HTTPDProxyHandler.java |
Mon May 06 14:27:39 CEST 2013 by Michael Peter Christen | draw the names of other peers which receive/send dht into the network graphic Changed Files: htroot/Network.html, htroot/NetworkPicture.java, source/net/yacy/peers/graphics/NetworkGraph.java, source/net/yacy/visualization/PrintTool.java, source/net/yacy/visualization/RasterPlotter.java |
Sun May 05 23:39:46 CEST 2013 by Michael Peter Christen | enlarge network graph circle according to image height and reduce the image height in the Network servlet. Overall, the image is now larger but takes less space on the web page. Changed Files: htroot/Network.html, source/net/yacy/peers/graphics/NetworkGraph.java |
Sun May 05 05:00:42 CEST 2013 by reger | remove pre 1.0 migration statement which possibly overwrites user navigator setting Changed Files: source/net/yacy/migration.java |
Sat May 04 09:34:06 CEST 2013 by Michael Peter Christen | typo Changed Files: htroot/IndexDeletion_p.html |
Sat May 04 01:14:10 CEST 2013 by Michael Peter Christen | - added regular-expression based deletions - on-demand collection-list generation for collection-based deletions instead of a default collection-list presentation (this makes calling the interface much faster since the computation of collections lists for large indexes may take some seconds) Changed Files: htroot/IndexDeletion_p.html, htroot/IndexDeletion_p.java |
Sat May 04 00:14:22 CEST 2013 by Michael Peter Christen | abstraction of catchall term Changed Files: htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryGoal.java |
Sat May 04 00:14:00 CEST 2013 by Michael Peter Christen | added the date to error documents Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java |
Fri May 03 03:55:14 CEST 2013 by reger | adjust Test case EmbeddedSolrConnector Changed Files: test/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnectorTest.java |
Fri May 03 02:03:30 CEST 2013 by Michael Peter Christen | fix for solr cache when a delete buffer is filled and a document, which is in the delete queue, is replaced with a new one. Changed Files: source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java |
Fri May 03 02:02:35 CEST 2013 by Michael Peter Christen | preventing score computation in solr where applicable Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java |
Fri May 03 00:24:39 CEST 2013 by orbiter | fix for http://bugs.yacy.net/view.php?id=233 - check geolocation coordinates and accept only those which are well-formed - the solr push process does not stop crawling any more if, after 20 requests to Solr, Solr does not accept the record. Instead, a severe log entry asks the user to create a bug request Changed Files: source/net/yacy/document/Document.java, source/net/yacy/kelondro/data/meta/URIMetadataRow.java, source/net/yacy/search/index/Segment.java |
Thu May 02 15:47:21 CEST 2013 by sixcooler | fix for PerformanceMemory showing UNRESOLVED_PATTERN by removing solr-cache-stuff, which is not available anymore Changed Files: htroot/PerformanceMemory_p.html, htroot/PerformanceMemory_p.java |
Tue Apr 30 11:44:56 CEST 2013 by Michael Peter Christen | remove sort order in all cases where not needed Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java |
Tue Apr 30 11:09:21 CEST 2013 by Michael Peter Christen | prevent that long-running deletion tasks block a hard commit. Changed Files: source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java |
Tue Apr 30 02:11:28 CEST 2013 by Michael Peter Christen | - added index deletion to index administration submenu - added index deletion processes to the process scheduler/recorder Changed Files: htroot/IndexDeletion_p.html, htroot/IndexDeletion_p.java, htroot/env/templates/submenuIndexControl.template, source/net/yacy/data/WorkTables.java |
Tue Apr 30 00:03:21 CEST 2013 by Saransh Sharma | New Hindi Translation Changed Files: locales/hi.lng |
Mon Apr 29 19:30:53 CEST 2013 by Michael Peter Christen | added an index deletion servlet and some style changes for the 'dangerous' engage-button Changed Files: htroot/IndexDeletion_p.html, htroot/IndexDeletion_p.java, skins/pdblue.css |
Mon Apr 29 19:30:04 CEST 2013 by Michael Peter Christen | added another solr connector, the ConcurrentUpdateSolrConnector which does not block when long-running updates to solr are made. This is realized using blocking queues which process all long-running tasks in the background. Also some bugfixes to existing connectors. Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java |
Mon Apr 29 19:28:17 CEST 2013 by Michael Peter Christen | added more features to ScoreMap (pretty toString) Changed Files: source/net/yacy/cora/sorting/AbstractScoreMap.java, source/net/yacy/cora/sorting/ClusteredScoreMap.java, source/net/yacy/cora/sorting/OrderedScoreMap.java, source/net/yacy/cora/sorting/ScoreMap.java, source/net/yacy/server/serverObjects.java |
Sun Apr 28 21:20:14 CEST 2013 by Michael Peter Christen | - re-introduced existById in solr connector. - introduced raw-queries for the re-introduced byId-Queries (they are hopefully faster than full edismax queries) - removed the cached solr connector (testing this) to rely only on the solr built-in search caches. That should also save some RAM. We will see if this is usable. Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/search/index/Fulltext.java |
Sat Apr 27 03:11:44 CEST 2013 by reger | added httpstatus_i to automatically switched on fields (used in all search queries) Changed Files: source/net/yacy/search/Switchboard.java |
Fri Apr 26 02:26:38 CEST 2013 by reger | RankingSolr_p: include warning if boost field not in local index Changed Files: htroot/RankingSolr_p.html, htroot/RankingSolr_p.java |
Wed Apr 24 01:14:35 CEST 2013 by Michael Peter Christen | added collection attribute also to the rss feed reader Changed Files: htroot/CrawlStartSite_p.html, htroot/Load_RSS_p.html, htroot/Load_RSS_p.java, source/net/yacy/cora/document/RSSMessage.java, source/net/yacy/crawler/retrieval/RSSLoader.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/schema/CollectionConfiguration.java |
Tue Apr 23 20:42:54 CEST 2013 by orbiter | added a 'collection' property attribute in yacysearch.html which can be used to select between different collections as defined during a crawl start with the 'collection' attribute. This actually implements the ability to prepare search tenants which restrict their search results to a specific collection. The main use for this is to provide tenants to the yaml4 interface (at this time). Changed Files: htroot/gsa/searchresult.java, htroot/yacysearch.java, source/net/yacy/cora/document/RSSMessage.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java |
Tue Apr 23 16:01:17 CEST 2013 by Saransh Sharma | More Translation Changed Files: locales/de.lng, locales/hi.lng |
Tue Apr 23 12:15:33 CEST 2013 by orbiter | increased row limitation for authorized users from 10000 to 100000000 in solr interface Changed Files: htroot/solr/select.java |
Mon Apr 22 22:33:13 CEST 2013 by Michael Peter Christen | extended limitation of dom export size from 100000 to 100000000 Changed Files: source/net/yacy/search/index/Fulltext.java |
Mon Apr 22 14:33:04 CEST 2013 by Michael Peter Christen | some extensions to raster plotter to transform a RGB picture to an indexed color scheme. This is needed for gif animations Changed Files: source/net/yacy/peers/graphics/NetworkGraph.java, source/net/yacy/visualization/RasterPlotter.java |
Sun Apr 21 12:29:05 CEST 2013 by Michael Peter Christen | added transparency to gif image animation and the integration to the YaCy httpd for on-the-fly generated gifs (including animated gifs) Changed Files: source/net/yacy/kelondro/util/ByteBuffer.java, source/net/yacy/peers/graphics/EncodedImage.java, source/net/yacy/server/http/HTTPDFileHandler.java, source/net/yacy/visualization/AnimationGIF.java |
Fri Apr 19 09:42:23 CEST 2013 by Saransh Sharma | Hello world Changed Files: locales/hi.lng |
Thu Apr 18 17:21:17 CEST 2013 by Michael Peter Christen | added new schema fields: hreflang_url_sxt and hreflang_cc_sxt for http://support.google.com/webmasters/bin/answer.py?hl=de&answer=189077 navigation_url_sxt and navigation_type_sxt for http://googlewebmastercentral.blogspot.de/2011/09/pagination-with-relnext-and-relprev.html publisher_url_s for http://support.google.com/plus/answer/1713826?hl=de all fields are disabled by default and not written to the index. Changed Files: defaults/solr.collection.schema, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Wed Apr 17 16:15:27 CEST 2013 by Michael Peter Christen | checking of document signature for a double-document check now refers only to documents within the same domain Changed Files: source/net/yacy/search/index/Segment.java |
Wed Apr 17 12:57:27 CEST 2013 by Michael Peter Christen | added hindi translation configuration Changed Files: htroot/ConfigBasic.html, source/net/yacy/data/Translator.java |
Wed Apr 17 11:11:55 CEST 2013 by Saransh Sharma | Hindi Some parts only Changed Files: locales/hi.lng |
Tue Apr 16 15:02:00 CEST 2013 by Michael Peter Christen | setting of new default values for ranking Changed Files: defaults/yacy.init, source/net/yacy/search/SwitchboardConstants.java |
Tue Apr 16 11:38:51 CEST 2013 by Michael Peter Christen | added in RankingSolr_p.html a select box to switch between different ranking situations. By default, four situations can be configured. Changed Files: htroot/RankingSolr_p.html, htroot/RankingSolr_p.java |
Tue Apr 16 01:35:15 CEST 2013 by Michael Peter Christen | added new solr title_exact_signature_l and description_exact_signature_l to be able to identify unique title and unique description fields. Changed Files: defaults/solr.collection.schema, source/net/yacy/cora/document/analysis/EnhancedTextProfileSignature.java, source/net/yacy/document/Condenser.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Sun Apr 14 20:52:40 CEST 2013 by Michael Peter Christen | added new field host_extent_i which, after a crawl and postprocessing, holds the number of documents for the host where the document is hosted. This is necessary for ranking and the norming of references per local host in the ranking computation. Changed Files: defaults/solr.collection.schema, defaults/yacy.init, htroot/IndexControlRWIs_p.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Sun Apr 14 11:30:57 CEST 2013 by Michael Peter Christen | showing now the details of references count in host browser: external (ext), internal (int) and external hosts (hosts) for each indexed document. Changed Files: htroot/HostBrowser.java |
Sun Apr 14 05:33:01 CEST 2013 by reger | add admin option to delete load errors from index Changed Files: htroot/HostBrowser.html, htroot/HostBrowser.java, htroot/HostBrowserAdmin_p.html |
Sat Apr 13 23:04:44 CEST 2013 by Marc Nause | *) did some long overdue refactoring Changed Files: source/net/yacy/repository/Blacklist.java |
Sat Apr 13 21:50:48 CEST 2013 by Marc Nause | *) fixed encoding of query in link to map (in case geolocalization is enabled, "Show search results for "köln" on map") *) applied suggestions of Checkstyle plugin Changed Files: htroot/yacysearchtrailer.java |
Fri Apr 12 16:17:14 CEST 2013 by Michael Peter Christen | added three new fields for a better ranking: references_internal_i, references_external_i and references_exthosts_i. These can be used to count and evaluate the number of external links to every web page. An experimental ranking function can be, e.g.: div(add(references_internal_i,product(references_external_i,references_exthosts_i)),add(clickdepth_i,1)) Changed Files: defaults/solr.collection.schema, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/kelondro/data/citation/CitationReference.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java |
Fri Apr 12 10:48:41 CEST 2013 by Michael Peter Christen | - setting the same default ranking in the solr interface as for YaCy search interfaces if no other ranking attributes are given - using the YaCy ranking in the GSA interface only if no GSA-style sort attribute was given - to avoid confusion about correct ranking attributes, only the default '0'-ranking profile is used and not a scenario-adapted one (site, date) because that should be configurable in the web interface before it is actually used for ranking. Changed Files: htroot/gsa/searchresult.java, htroot/solr/select.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java |
Thu Apr 11 15:07:08 CEST 2013 by Michael Peter Christen | resume paused crawls on startup; user expects that restarts 'heal' everything Changed Files: source/net/yacy/search/Switchboard.java |
Thu Apr 11 14:46:13 CEST 2013 by Michael Peter Christen | - showing references count and clickdepth in host browser - fixed generation and presentation of both values Changed Files: htroot/HostBrowser.html, htroot/HostBrowser.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/schema/CollectionConfiguration.java |
Tue Apr 09 18:55:26 CEST 2013 by orbiter | if the crawl was paused (automatically), show the reason for pausing in the Crawler_p servlet. Changed Files: htroot/Crawler_p.html, htroot/Crawler_p.java |
Mon Apr 08 21:25:21 CEST 2013 by reger | fix: Index Administration > Reverse Word Index (IndexControlRWIs_p) corrected use of word search to word-hash search - removed duplicate QueryParams.hashes2Handles , redundant with .hashes2Set Changed Files: htroot/IndexControlRWIs_p.java, htroot/yacy/search.java, source/net/yacy/search/query/QueryParams.java |
Sun Apr 07 10:36:05 CEST 2013 by Michael Peter Christen | added missing library after solr upgrade Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/lucene-codecs-4.2.1.jar |
Sat Apr 06 23:00:48 CEST 2013 by reger | adjust Netbeans IDE project.xml classpath for Solr 4.2.1 jars Changed Files: nbproject/project.xml |
Sat Apr 06 02:34:56 CEST 2013 by reger | comment out dead menu link Changed Files: htroot/env/templates/submenuIndexControl.template |
Sat Apr 06 02:08:01 CEST 2013 by reger | uncomment "used time" calculation for remote search log Changed Files: htroot/AccessTracker_p.html, htroot/AccessTracker_p.java |
Fri Apr 05 03:33:33 CEST 2013 by reger | improve remote search log, set "Returned Results" to transmitcount (instead of no value) Changed Files: htroot/AccessTracker_p.html, htroot/AccessTracker_p.java |
Thu Apr 04 00:40:59 CEST 2013 by reger | - fix opensearch discover err msg - webgraph not enabled - if no opensearchdescription link found in index - remove search2.net from sample config (is down) Changed Files: defaults/heuristicopensearch.conf, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java |
Mon Apr 01 03:51:57 CEST 2013 by reger | make sure configured port is reported on recreated mySeed.txt Changed Files: source/net/yacy/peers/Seed.java |
Tue Mar 19 11:23:18 CET 2013 by Michael Peter Christen | better search timing; prevents '0 results' for very large local indexes >> 10 mio documents Changed Files: htroot/yacysearchitem.java |
Tue Mar 19 10:33:35 CET 2013 by Michael Peter Christen | fix in GSA result writer which evaluates result context fields as String. After the migration to Solr 4.1.0 'some' of these fields suddenly are stored as String[]; this patch compensates this confusion. Changed Files: source/net/yacy/cora/federate/solr/responsewriter/GSAResponseWriter.java |
Tue Mar 19 10:32:01 CET 2013 by Michael Peter Christen | - callback fix - memory allocation problem in RowCollection: if memory is too low, do not try to increase by 1 because this leads to very long execution time and at the end to the same OOM as if we allocate the memory at the moment we need it even if the resource observer states that this memory is not there. To compensate this, the increase size is reduced. Changed Files: htroot/portalsearch/yacy-portalsearch.js, source/net/yacy/kelondro/index/RowCollection.java |
Tue Mar 19 00:59:47 CET 2013 by orbiter | renamed callback function to 'callback' because that is a standard for jsonp which is also used in backbone.js/jquery Changed Files: source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java |
Sun Mar 17 22:13:56 CET 2013 by orbiter | increased number of links limitation from 1000 to 10000 for rss feeds and html documents Changed Files: defaults/solr.webgraph.schema, source/net/yacy/cora/document/RSSFeed.java, source/net/yacy/document/parser/htmlParser.java |
Sun Mar 17 11:43:12 CET 2013 by Frank | add the new PPMbar in Crawler_p for a better style and better use. Changed Files: htroot/Crawler_p.html, htroot/js/Crawler.js |
Sun Mar 17 10:52:31 CET 2013 by orbiter | enhanced did-you-mean (a bit): can now remember previously searched words (plus small enhancements) Changed Files: htroot/suggest.java, source/net/yacy/cora/document/WordCache.java, source/net/yacy/data/DidYouMean.java, source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryGoal.java |
Sun Mar 17 03:46:29 CET 2013 by reger | add: button in IndexSchema_p to reset the Solr schema field selection to default Changed Files: htroot/IndexSchema_p.html, htroot/IndexSchema_p.java, source/net/yacy/search/Switchboard.java |