YaCy Release current_development

Major Changes   

Commit / Description
Tue Nov 24 21:35:58 CET 2015
by reger
upd to Jetty v9.2.14.v20151106
Changed Files: .classpath, build.xml, lib/jetty-9.2.14.v20151106.License, lib/jetty-client-9.2.14.v20151106.jar, lib/jetty-continuation-9.2.14.v20151106.jar, lib/jetty-deploy-9.2.14.v20151106.jar, lib/jetty-http-9.2.14.v20151106.jar, lib/jetty-io-9.2.14.v20151106.jar, lib/jetty-jmx-9.2.14.v20151106.jar, lib/jetty-proxy-9.2.14.v20151106.jar, lib/jetty-security-9.2.14.v20151106.jar, lib/jetty-server-9.2.14.v20151106.jar, lib/jetty-servlet-9.2.14.v20151106.jar, lib/jetty-servlets-9.2.14.v20151106.jar, lib/jetty-util-9.2.14.v20151106.jar, lib/jetty-webapp-9.2.14.v20151106.jar, lib/jetty-xml-9.2.14.v20151106.jar, nbproject/project.xml, pom.xml
Fri Nov 20 09:38:16 CET 2015
by luc
BMP and ICO image format support : integrated the haraldk/TwelveMonkeys
imageio-bmp-3.2 library.

 - better BMP format flavours support
 - handle PNG encoded icons
 - handle transparency
 
Added some javadoc url references to .classpath
Changed Files: .classpath, build.xml, lib/imageio-bmp-3.2.jar, pom.xml, source/net/yacy/document/ImageParser.java, source/net/yacy/document/parser/images/bmpParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/icoParser.java
Tue Nov 17 22:27:07 CET 2015
by reger
avoid File.deleteOnExit() on temp files
The JVM registers each such file in a list, regardless of whether it has
already been deleted, and never cleans up the list during runtime.
This accumulates to a considerable amount of memory during large crawls
and/or long uptime.
To tackle this, all temp files are now created in a subdir of java.io.tmpdir 
and the jvm tmpdir property is set to this subdir, which is deleted by
code on shutdown.
Additionally let pdfParser use this tmp subdir too.
Changed Files: source/net/yacy/document/parser/apkParser.java, source/net/yacy/document/parser/audioTagParser.java, source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/gzipParser.java, source/net/yacy/document/parser/odtParser.java, source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/kelondro/util/FileUtils.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/yacy.java
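The temp-file handling described in the entry above can be sketched roughly as follows. This is a minimal illustration, not the code from FileUtils.java or yacy.java: the class name, method names and the directory name "yacy-tmp-sketch" are hypothetical. It assumes that one shutdown hook deleting the whole subdirectory replaces the per-file deleteOnExit() registrations.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

final class TempSubdirSketch {

    /** Creates (if needed) a private subdirectory below the current java.io.tmpdir. */
    static Path createTempSubdir(String name) throws IOException {
        Path base = Paths.get(System.getProperty("java.io.tmpdir"));
        return Files.createDirectories(base.resolve(name));
    }

    /** Creates a temp file inside the given directory without calling deleteOnExit(). */
    static File newTempFile(Path dir, String prefix, String suffix) throws IOException {
        return File.createTempFile(prefix, suffix, dir.toFile());
    }

    /** Deletes the directory and everything below it, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) return;
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        Path sub = createTempSubdir("yacy-tmp-sketch"); // hypothetical directory name
        // Point the JVM-wide property at the subdirectory, as the entry describes.
        System.setProperty("java.io.tmpdir", sub.toString());
        // One hook cleans up everything at shutdown, so no per-file list grows.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                deleteRecursively(sub);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }));
        File f = newTempFile(sub, "parser", ".tmp");
        System.out.println("created " + f);
    }
}
```

The sketch passes the directory to File.createTempFile explicitly, so it does not depend on when the JVM reads the java.io.tmpdir property.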
Sat Nov 14 01:46:25 CET 2015
by reger
upd to TwelveMonkeys ImageIO 3.2
Changed Files: .classpath, build.xml, lib/common-image-3.2.jar, lib/common-io-3.2.jar, lib/common-lang-3.2.jar, lib/imageio-core-3.2.jar, lib/imageio-metadata-3.2.jar, lib/imageio-tiff-3.2.jar, nbproject/project.xml, pom.xml
Thu Nov 05 09:45:19 CET 2015
by luc
Refactoring : default favicon and image processing errors.
 - moved default favicon processing from ViewImage to
yacysearchitem.html : when previewing ico image search results we don't
want a default favicon to be displayed
 - throw an IOException ending in an HTTP 500 error when image processing
fails, rather than returning a null result : behavior is more consistent
across browsers (for example Chrome and Firefox), especially with the new
default favicon display system
Changed Files: htroot/ViewImage.java, htroot/env/base.css, htroot/env/oldie.css, htroot/yacy/ui/css/base.css, htroot/yacy/ui/css/widget.css, htroot/yacysearchitem.html
Sat Oct 31 23:09:03 CET 2015
by reger
differentiate api call getLocalPort() from getConfigInt()
Changed Files: htroot/ConfigBasic.java, htroot/ConfigPortal.java, htroot/ConfigSearchBox.java, htroot/CrawlStartScanner_p.java, htroot/Load_MediawikiWiki.java, htroot/Load_PHPBB3.java, htroot/SettingsAck_p.java, htroot/Settings_p.java, htroot/Table_API_p.java, htroot/api/push_p.java, htroot/opensearchdescription.java, htroot/yacysearch.java, htroot/yacysearch_location.java, source/net/yacy/gui/Tray.java, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/serverSwitch.java, source/net/yacy/yacy.java
Mon Oct 19 00:53:10 CEST 2015
by reger
upd to jetty 9.2.13.v20150730
Changed Files: .classpath, build.xml, lib/jetty-9.2.13.v20150730.License, lib/jetty-client-9.2.13.v20150730.jar, lib/jetty-continuation-9.2.13.v20150730.jar, lib/jetty-deploy-9.2.13.v20150730.jar, lib/jetty-http-9.2.13.v20150730.jar, lib/jetty-io-9.2.13.v20150730.jar, lib/jetty-jmx-9.2.13.v20150730.jar, lib/jetty-proxy-9.2.13.v20150730.jar, lib/jetty-security-9.2.13.v20150730.jar, lib/jetty-server-9.2.13.v20150730.jar, lib/jetty-servlet-9.2.13.v20150730.jar, lib/jetty-servlets-9.2.13.v20150730.jar, lib/jetty-util-9.2.13.v20150730.jar, lib/jetty-webapp-9.2.13.v20150730.jar, lib/jetty-xml-9.2.13.v20150730.jar, nbproject/project.xml, pom.xml
Sun Oct 18 19:53:39 CEST 2015
by reger
upd httpclient-4.5.1, httpmime-4.5.1, httpcore-4.4.3, commons-compress-1.10
Changed Files: .classpath, build.xml, lib/commons-compress-1.10.License, lib/commons-compress-1.10.jar, lib/httpclient-4.5.1.jar, lib/httpcore-4.4.3.License, lib/httpcore-4.4.3.jar, lib/httpmime-4.5.1.jar, nbproject/project.xml, pom.xml
Sun Oct 18 05:51:01 CEST 2015
by reger
implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
see freshly deprecated https://developers.google.com/webmasters/ajax-crawling/
Implementation improves parsing of the homepage (ajax page) which uses metatag "fragment" in header and parses supplied html snapshot instead of mostly empty ajax/scripted page.
Implementation also supports hash-bang urls (url with anchor starting with ! like  ...path#!hashfragment) but our crawler filters them
(use of hash-bang is controversially discussed and the proposal is deprecated, so it makes no sense to adjust the crawler, but as long as it is used by some sites the minor change/improvement in htmlparser is good for some time).
Quick overview of how it works:
- if metatag fragment with content "!" is found
   - htmlparser tries to get the content of the html snapshot (using a different url)
   - htmlparser returns 2 documents (original url and snapshot content - but using the same original url)
- after parsing, the result documents are joined (and stored to the index, which then also contains content from the snapshot page, as the original ajax page typically contains no parseable html content)
Changed Files: source/net/yacy/document/parser/htmlParser.java
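The "different url" used for the html snapshot follows the mapping defined by the (now deprecated) AJAX crawling proposal: the hash-bang fragment is moved into an _escaped_fragment_ query parameter, and a page carrying the fragment metatag gets an empty parameter value. The following is a minimal sketch of that mapping only. The class and method names are hypothetical, the escaping is a simplified reading of the proposal, and it is not the code in htmlParser.java.

```java
import java.nio.charset.StandardCharsets;

final class AjaxSnapshotUrlSketch {

    /** Maps a page url to the snapshot url of the AJAX crawling scheme. */
    static String snapshotUrl(String url) {
        int bang = url.indexOf("#!");
        int hash = url.indexOf('#');
        String base;
        String fragment;
        if (bang >= 0) {
            base = url.substring(0, bang);
            fragment = url.substring(bang + 2);
        } else {
            // No hash-bang: assume the page declared <meta name="fragment" content="!">
            base = hash >= 0 ? url.substring(0, hash) : url;
            fragment = "";
        }
        String separator = base.indexOf('?') >= 0 ? "&" : "?";
        return base + separator + "_escaped_fragment_=" + escape(fragment);
    }

    /** Percent-encodes control characters, space, '#', '%', '&', '+' and non-ASCII bytes. */
    static String escape(String fragment) {
        StringBuilder sb = new StringBuilder();
        for (byte b : fragment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v <= 0x20 || v == 0x23 || v == 0x25 || v == 0x26 || v == 0x2B || v >= 0x7F) {
                sb.append(String.format("%%%02X", v));
            } else {
                sb.append((char) v);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshotUrl("http://example.com/page?q=1#!key=value"));
    }
}
```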
Thu Oct 15 10:06:51 CEST 2015
by luc
Integrated haraldk/TwelveMonkeys library to first add TIF image format
support.
Changed Files: .classpath, build.xml, htroot/ViewImage.java, lib/common-image-3.1.2.jar, lib/common-io-3.1.2.jar, lib/common-lang-3.1.2.jar, lib/imageio-core-3.1.2.jar, lib/imageio-metadata-3.1.2.jar, lib/imageio-tiff-3.1.2.jar, lib/servlet-3.1.2.jar, pom.xml, source/net/yacy/document/ImageParser.java
Sat Oct 03 23:20:33 CEST 2015
by reger
upd to solr/lucene 5.3.1
Changed Files: .classpath, build.xml, htroot/yacysearchtrailer.java, lib/lucene-analyzers-common-5.3.1.jar, lib/lucene-analyzers-phonetic-5.3.1.jar, lib/lucene-backward-codecs-5.3.1.jar, lib/lucene-classification-5.3.1.jar, lib/lucene-codecs-5.3.1.jar, lib/lucene-core-5.3.1.jar, lib/lucene-facet-5.3.1.jar, lib/lucene-grouping-5.3.1.jar, lib/lucene-highlighter-5.3.1.jar, lib/lucene-join-5.3.1.jar, lib/lucene-memory-5.3.1.jar, lib/lucene-misc-5.3.1.jar, lib/lucene-queries-5.3.1.jar, lib/lucene-queryparser-5.3.1.jar, lib/lucene-spatial-5.3.1.jar, lib/lucene-suggest-5.3.1.jar, lib/solr-core-5.3.1.jar, lib/solr-solrj-5.3.1.jar, nbproject/project.xml, pom.xml, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/crawler/HostQueue.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/server/serverObjects.java
Thu Sep 24 13:50:23 CEST 2015
by Michael Peter Christen
Merge pull request #14 from luccioman/master

Translator refactoring : no more regular expression processing
Changed Files: htroot/Network.html, htroot/Table_API_p.html, htroot/Table_RobotsTxt_p.html, htroot/YMarks.html, htroot/env/style.java, htroot/yacy/ui/yacyui-admin.html, locales/cn.lng, locales/de.lng, locales/fr.lng, locales/gr.lng, locales/hi.lng, locales/it.lng, locales/ru.lng, locales/sk.lng, locales/uk.lng, source/net/yacy/data/Translator.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/utils/translation/ExtensionsFileFilter.java, source/net/yacy/utils/translation/ListNonTranslatedFiles.java, source/net/yacy/utils/translation/TranslateAll.java, source/net/yacy/utils/translation/TranslateAllToOneLanguage.java, source/net/yacy/utils/translation/TranslatorUtil.java
Fri Sep 11 17:20:11 CEST 2015
by luccioman
Translator refactoring : to simplify locale files writing, process keys
as simple strings and no more as regular expressions.
Updated all locale files to adapt to the refactored Translator : removed
useless escaped characters and did minor corrections.
Performed minor syntax corrections on some html source files.
Added an util to translate all html source files with all locales
without launching full YaCy application.
Corrected main arguments parsing on other translation utils.
Changed Files: htroot/Network.html, htroot/Table_API_p.html, htroot/Table_RobotsTxt_p.html, htroot/YMarks.html, htroot/yacy/ui/yacyui-admin.html, locales/cn.lng, locales/de.lng, locales/fr.lng, locales/gr.lng, locales/hi.lng, locales/it.lng, locales/ru.lng, locales/sk.lng, locales/uk.lng, source/net/yacy/data/Translator.java, source/net/yacy/utils/translation/ExtensionsFileFilter.java, source/net/yacy/utils/translation/ListNonTranslatedFiles.java, source/net/yacy/utils/translation/TranslateAll.java, source/net/yacy/utils/translation/TranslateAllToOneLanguage.java, source/net/yacy/utils/translation/TranslatorUtil.java
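The difference between treating translation keys as plain strings and as regular expressions can be illustrated with a small sketch. This is not the Translator class itself: the class name, the method and the sample key are hypothetical. The point is that String.replace matches the key literally, so characters such as parentheses or question marks need no escaping in a locale file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LiteralTranslatorSketch {

    /** Applies every key/value pair as a literal (non-regex) replacement. */
    static String translate(String source, Map<String, String> translations) {
        String result = source;
        for (Map.Entry<String, String> e : translations.entrySet()) {
            // String.replace takes both arguments literally, unlike replaceAll.
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> de = new LinkedHashMap<>();
        de.put("Pages (total)", "Seiten (gesamt)"); // hypothetical locale entry
        System.out.println(translate("<td>Pages (total)</td>", de));
    }
}
```

With a regex-based approach the same key would have to be written as "Pages \(total\)", which is the kind of escaping the entry says was removed from the locale files.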
Sat Sep 05 14:12:17 CEST 2015
by Michael Peter Christen
when many crawl queues are generated, this NPE can occur; probably
caused by a concurrency issue:
W 2015/09/05 14:09:10 ConcurrentLog java.lang.NullPointerException
java.lang.NullPointerException
	at java.util.TreeMap.rotateRight(TreeMap.java:2239)
	at java.util.TreeMap.fixAfterInsertion(TreeMap.java:2271)
	at java.util.TreeMap.put(TreeMap.java:582)
	at net.yacy.kelondro.table.Table.<init>(Table.java:235)
	at net.yacy.crawler.HostQueue.openStack(HostQueue.java:229)
	at net.yacy.crawler.HostQueue.getStack(HostQueue.java:204)
	at net.yacy.crawler.HostQueue.push(HostQueue.java:397)
	at net.yacy.crawler.HostBalancer.push(HostBalancer.java:237)
	at net.yacy.crawler.data.NoticedURL.push(NoticedURL.java:184)
	at net.yacy.crawler.CrawlStacker.stackCrawl(CrawlStacker.java:355)
	at net.yacy.crawler.CrawlStacker.job(CrawlStacker.java:134)
	at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:101)
	at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:82)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
	
Changed Files: source/net/yacy/kelondro/table/Table.java
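A java.util.TreeMap is not thread-safe, and unsynchronized concurrent put() calls can corrupt its internal tree, which is one plausible way to end up in a trace like the one above. The entry does not say how Table.java was changed, so the following is only a generic sketch of one common remedy: guarding every access to a shared TreeMap with the same lock. All names are hypothetical.

```java
import java.util.TreeMap;

final class GuardedTreeMapSketch {

    private final TreeMap<String, Integer> map = new TreeMap<>();

    /** All mutations go through the same monitor, so tree rebalancing is never interleaved. */
    synchronized void put(String key, Integer value) {
        map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }

    /** Fills one shared map from several threads and returns the final size. */
    static int fillConcurrently(int threads, int perThread) throws InterruptedException {
        GuardedTreeMapSketch shared = new GuardedTreeMapSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.put(id + "-" + i, i); // keys are unique per thread
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return shared.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fillConcurrently(4, 1000));
    }
}
```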
Mon Aug 31 20:24:41 CEST 2015
by sixcooler
fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
moved and was not cleared anymore. This results in a huge fieldcache.
(http://lucene.apache.org/#highlights-of-the-lucene-release-include
https://issues.apache.org/jira/browse/LUCENE-5666)
Here I try to use DocValues where it is possible.
For this I used the Api-Scheme as the new basis for the Solr-Schema.
This needs at least a complete optimization of the Solr-Index to get a
smaller FieldCache.
Everything that is indexed with these settings will not use the
Fieldcache at all.
Changed Files: defaults/solr/schema.xml, htroot/api/schema.java, htroot/api/schema.xml, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java
Mon Aug 10 14:27:44 CEST 2015
by Michael Peter Christen
added a new facet type based on a probabilistic classifier using
bayesian filters. This can be used to classify documents during
indexing-time using a pre-defined bayesian filter.

New wordings:
- a context is a class where different categories are possible. The
context name is equal to a facet name.
- a category is a facet type within a facet navigation. Each context
must have several categories, at least one custom name (things you want
to discover) and one with the exact name "negative".

To use this, you must do:
- for each context, you must create a directory within
DATA/CLASSIFICATION with the name of the context (the facet name)
- within each context directory, you must create one text file for every
category, containing one document per line. One of these files MUST
have the name 'negative.txt'.

Then, each new document is classified to match within one of the given
categories for each context.
Changed Files: defaults/yacy.init, htroot/js/yacysearch.js, htroot/yacysearch.java, source/net/yacy/cora/language/synonyms/AutotaggingLibrary.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/document/Document.java, source/net/yacy/document/ProbabilisticClassifier.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/schema/CollectionConfiguration.java
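The directory layout required above (one directory per context below DATA/CLASSIFICATION, one .txt file per category, one of them named negative.txt) can be checked with a few lines of Java. This is only a sketch of such a check under the stated layout. The class and method names are hypothetical and it is not part of ProbabilisticClassifier.java.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.SortedSet;
import java.util.TreeSet;

final class ClassificationLayoutSketch {

    /** Returns the category names (file names without ".txt") of one context directory. */
    static SortedSet<String> categories(Path contextDir) throws IOException {
        SortedSet<String> names = new TreeSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(contextDir, "*.txt")) {
            for (Path f : files) {
                String n = f.getFileName().toString();
                names.add(n.substring(0, n.length() - ".txt".length()));
            }
        }
        return names;
    }

    /** A context needs the "negative" category plus at least one custom category. */
    static boolean isUsable(Path contextDir) throws IOException {
        SortedSet<String> c = categories(contextDir);
        return c.contains("negative") && c.size() >= 2;
    }

    public static void main(String[] args) throws IOException {
        Path ctx = Files.createTempDirectory("context-sketch"); // stands in for a context directory
        Files.write(ctx.resolve("negative.txt"), "some unrelated text\n".getBytes());
        Files.write(ctx.resolve("sports.txt"), "a match report\n".getBytes());
        System.out.println(categories(ctx) + " usable=" + isUsable(ctx));
    }
}
```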
Mon Aug 03 05:37:34 CEST 2015
by Michael Peter Christen
removed warnings
Changed Files: htroot/Bookmarks.java, htroot/ConfigHTCache_p.java, htroot/IndexControlRWIs_p.java, htroot/ViewFile.java, htroot/yacysearch.java, htroot/yacysearchtrailer.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/crawler/Balancer.java, source/net/yacy/crawler/RecrawlBusyThread.java, source/net/yacy/data/BookmarksDB.java, source/net/yacy/data/WorkTables.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/util/OS.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/ranking/ReferenceOrder.java, source/net/yacy/utils/cryptbig.java
Sun Aug 02 14:52:41 CEST 2015
by Michael Peter Christen
a collection of search query enhancements:
- fixed superfluous space in query field list
- fixed filter query logic
- removed the look-ahead query, which caused each new search page to
submit two solr queries
- fixed random solr result orders in case the solr score was equal:
the results were then re-ordered by YaCy using the document hash which came
from the solr object, and that appeared to be random. Now the hash of the url
is used and the score is additionally modified by the url length to
prevent this particular case from appearing at all.
Changed Files: source/net/yacy/cora/federate/solr/Ranking.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
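The general idea behind the last point, making the order of equally scored results deterministic, can be sketched with a comparator that falls back to properties of the url itself. This is an illustration only and deliberately simpler than what the entry describes: it uses the url length and the url text as tie-breakers instead of modifying the score or using the YaCy url hash. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class StableRankingSketch {

    static final class Hit {
        final String url;
        final double score;

        Hit(String url, double score) {
            this.url = url;
            this.score = score;
        }
    }

    /** Higher score first; among equal scores the shorter url wins, then the url text decides. */
    static final Comparator<Hit> ORDER = Comparator
            .comparingDouble((Hit h) -> h.score).reversed()
            .thenComparingInt(h -> h.url.length())
            .thenComparing(Comparator.comparing((Hit h) -> h.url));

    /** Returns the urls in ranked order without modifying the input list. */
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(ORDER);
        List<String> urls = new ArrayList<>();
        for (Hit h : sorted) {
            urls.add(h.url);
        }
        return urls;
    }
}
```

Because every tie-breaker depends only on the url, the same result set produces the same order no matter in which order the hits arrive.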
Sat Aug 01 00:25:40 CEST 2015
by reger
upd to pdfbox-1.8.10
Changed Files: .classpath, build.xml, lib/fontbox-1.8.10.License, lib/fontbox-1.8.10.jar, lib/jempbox-1.8.10.License, lib/jempbox-1.8.10.jar, lib/pdfbox-1.8.10.License, lib/pdfbox-1.8.10.jar, nbproject/project.xml, pom.xml
Thu Jul 30 00:16:09 CEST 2015
by reger
upd to Solr-5.2.1
Changed Files: .classpath, build.xml, lib/lucene-analyzers-common-5.2.1.jar, lib/lucene-analyzers-phonetic-5.2.1.jar, lib/lucene-backward-codecs-5.2.1.jar, lib/lucene-classification-5.2.1.jar, lib/lucene-codecs-5.2.1.jar, lib/lucene-core-5.2.1.jar, lib/lucene-facet-5.2.1.jar, lib/lucene-grouping-5.2.1.jar, lib/lucene-highlighter-5.2.1.jar, lib/lucene-join-5.2.1.jar, lib/lucene-memory-5.2.1.jar, lib/lucene-misc-5.2.1.jar, lib/lucene-queries-5.2.1.jar, lib/lucene-queryparser-5.2.1.jar, lib/lucene-spatial-5.2.1.jar, lib/lucene-suggest-5.2.1.jar, lib/solr-core-5.2.1.jar, lib/solr-solrj-5.2.1.jar, nbproject/project.xml, pom.xml
Wed Jul 29 23:30:05 CEST 2015
by reger
move the java environment parameter setting that disables SNI (Server Name Indication) support for https connections from code to the startup script, allowing an admin to easily and transparently alter the YaCy default setting of FALSE.
Background: some users report problems with connecting to/crawling some sites via https which require SNI support (switched off by default in YaCy). On the other hand, systems not demanding SNI support are sometimes not properly configured, and due to a bug/feature in java 1.7 the connection is aborted. The latter is more often the case, so the default is still fine. With the java start parameter, expert users can now alter the start parameter to -Djsse.enableSNIExtension=true (java default) if they crawl more hosts requiring SNI support.
The alternative of letting YaCy try both during the https handshake (deep inside the httpclient) is not pursued at this time.
Changed Files: source/net/yacy/yacy.java, startYACY.bat, startYACY.sh, startYACY_debug.bat
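To see which value of the start parameter a running JVM actually received, the system property can simply be read back. The following is a small sketch with hypothetical names. It assumes the simplified rule that SNI is on unless the property is explicitly "false", which matches the java default mentioned in the entry but does not reproduce the JDK's own parsing.

```java
final class SniSettingSketch {

    /** Simplified rule: SNI is considered enabled unless the property value is "false". */
    static boolean sniEnabled(String propertyValue) {
        return propertyValue == null || !"false".equalsIgnoreCase(propertyValue.trim());
    }

    public static void main(String[] args) {
        // Set at JVM start, for example with -Djsse.enableSNIExtension=true
        String value = System.getProperty("jsse.enableSNIExtension");
        System.out.println("jsse.enableSNIExtension=" + value
                + " -> SNI " + (sniEnabled(value) ? "enabled" : "disabled"));
    }
}
```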
Thu Jul 02 18:41:13 CEST 2015
by Ryszard Go?
Use language-detection library for increased accuracy
Changed Files: .classpath, build.xml, langdetect/af, langdetect/ar, langdetect/bg, langdetect/bn, langdetect/cs, langdetect/da, langdetect/de, langdetect/el, langdetect/en, langdetect/es, langdetect/et, langdetect/fa, langdetect/fi, langdetect/fr, langdetect/gu, langdetect/he, langdetect/hi, langdetect/hr, langdetect/hu, langdetect/id, langdetect/it, langdetect/ja, langdetect/kn, langdetect/ko, langdetect/lt, langdetect/lv, langdetect/mk, langdetect/ml, langdetect/mr, langdetect/ne, langdetect/nl, langdetect/no, langdetect/pa, langdetect/pl, langdetect/pt, langdetect/ro, langdetect/ru, langdetect/sk, langdetect/sl, langdetect/so, langdetect/sq, langdetect/sv, langdetect/sw, langdetect/ta, langdetect/te, langdetect/th, langdetect/tl, langdetect/tr, langdetect/uk, langdetect/ur, langdetect/vi, langdetect/zh-cn, langdetect/zh-tw, lib/jsonic-1.2.0.jar, lib/langdetect.jar, lib/langdetect.jar.License, source/net/yacy/document/language/Identificator.java
Wed Jun 24 01:55:51 CEST 2015
by Michael Peter Christen
migration to Solr 5.2: huge benefits - this is a lot faster!

This is a very complex migration: many classes had been renamed or
removed, dependencies changed and the solr index type is now aligned to
be a solr cloud repository.
Together with the Solr 5.2 library update, one other dependent library
had been updated as well: httpclient 4.4->4.4.1

Older indexes are migrated from 4_10 to 5_2. However, the new index
structure is more efficient and we recommend re-indexing everything.
Before you do the update, please use the index export to write a large
surrogate xml file. After the update, start with an empty index and then
initialize it with your dump.
Changed Files: .classpath, build.xml, defaults/solr/schema.xml, defaults/solr/solr.xml, defaults/solr/solrconfig.xml, htroot/IndexControlURLs_p.java, lib/httpclient-4.4.1.License, lib/httpclient-4.4.1.jar, lib/httpcore-4.4.1.License, lib/httpmime-4.4.1.License, lib/httpmime-4.4.1.jar, lib/lucene-analyzers-common-5.2.0.jar, lib/lucene-analyzers-phonetic-5.2.0.jar, lib/lucene-backward-codecs-5.2.0.jar, lib/lucene-classification-5.2.0.jar, lib/lucene-codecs-5.2.0.jar, lib/lucene-core-5.2.0.jar, lib/lucene-facet-5.2.0.jar, lib/lucene-grouping-5.2.0.jar, lib/lucene-highlighter-5.2.0.jar, lib/lucene-join-5.2.0.jar, lib/lucene-memory-5.2.0.jar, lib/lucene-misc-5.2.0.jar, lib/lucene-queries-5.2.0.jar, lib/lucene-queryparser-5.2.0.jar, lib/lucene-spatial-5.2.0.jar, lib/lucene-suggest-5.2.0.jar, lib/noggit-0.6.jar, lib/solr-core-5.2.0.jar, lib/solr-solrj-5.2.0.jar, lib/stax2-api-3.1.4.jar, lib/woodstox-core-asl-4.4.1.jar, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSelection.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/federate/solr/instance/ServerMirror.java, source/net/yacy/cora/federate/solr/instance/ServerShard.java, source/net/yacy/cora/federate/solr/instance/ShardInstance.java, source/net/yacy/cora/federate/solr/instance/SolrInstance.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java
Sun Jun 07 20:09:27 CEST 2015
by sixcooler
Update to Jetty-9.2.11 - a bugfix release that did not solve my
problems, but does not harm anything
Changed Files: .classpath, build.xml, lib/jetty-9.2.11.v20150529.License, lib/jetty-client-9.2.11.v20150529.jar, lib/jetty-continuation-9.2.11.v20150529.jar, lib/jetty-deploy-9.2.11.v20150529.jar, lib/jetty-http-9.2.11.v20150529.jar, lib/jetty-io-9.2.11.v20150529.jar, lib/jetty-jmx-9.2.11.v20150529.jar, lib/jetty-proxy-9.2.11.v20150529.jar, lib/jetty-security-9.2.11.v20150529.jar, lib/jetty-server-9.2.11.v20150529.jar, lib/jetty-servlet-9.2.11.v20150529.jar, lib/jetty-servlets-9.2.11.v20150529.jar, lib/jetty-util-9.2.11.v20150529.jar, lib/jetty-webapp-9.2.11.v20150529.jar, lib/jetty-xml-9.2.11.v20150529.jar, nbproject/project.xml, pom.xml
Sat May 30 06:31:08 CEST 2015
by Michael Peter Christen
added intensity option to graphics
Changed Files: htroot/AccessPicture_p.java, htroot/WebStructurePicture_p.java, htroot/imagetest.java, htroot/osm.java, source/net/yacy/peers/graphics/Banner.java, source/net/yacy/peers/graphics/NetworkGraph.java, source/net/yacy/visualization/Captcha.java, source/net/yacy/visualization/ChartPlotter.java, source/net/yacy/visualization/DemoApplet.java, source/net/yacy/visualization/GraphPlotter.java, source/net/yacy/visualization/HexGridPlotter.java, source/net/yacy/visualization/PrintTool.java, source/net/yacy/visualization/RasterPlotter.java
Fri May 29 15:05:52 CEST 2015
by Michael Peter Christen
added a full solr export to the IndexControlURLs_p.html servlet. The
export function is also now the default export option. The export file
format for a full solr export is very similar to a solr search result
xml, only the <lst name="responseHeader"> tag is missing.

The exported xml has a special line termination feature: all documents
will be exported into a single line without any CR in between. That
means that every document is completely inside a single line. While this
is not readable at all for humans, it is very useful for linux line
processing scripts, like grep. Using grep it will be easy to select
single documents which match for a given pattern.

Such dumps shall be importable with the DATA/SURROGATE/in import
function, but that import is not yet adapted to the new file format.
Changed Files: htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, source/net/yacy/cora/util/CRIgnoreWriter.java, source/net/yacy/search/index/Fulltext.java
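The one-document-per-line property described above can be obtained by wrapping the output in a writer that drops line breaks. The following is a minimal sketch of such a wrapper with hypothetical names. It is not the CRIgnoreWriter class from the change, and it assumes that both CR and LF are removed.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

final class LineBreakStrippingWriter extends FilterWriter {

    LineBreakStrippingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        // Forward everything except carriage return and line feed.
        if (c != '\r' && c != '\n') {
            out.write(c);
        }
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(buf[i]);
        }
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(s.charAt(i));
        }
    }

    /** Convenience helper: returns the given document text as one single line. */
    static String singleLine(String doc) throws IOException {
        StringWriter target = new StringWriter();
        try (Writer w = new LineBreakStrippingWriter(target)) {
            w.write(doc);
        }
        return target.toString();
    }
}
```

An exporter would write each document through such a wrapper and then emit one real line break after the document, so that a tool like grep returns whole documents.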
Sat May 23 02:06:39 CEST 2015
by reger
Init remote crawler on demand
If the remote crawl option is not activated, skip the init of remoteCrawlJob to save the resources of the queue and the idling thread.
Deployment of the remoteCrawlJob is deferred until activation of the option.
Changed Files: htroot/ConfigNetwork_p.java, htroot/RemoteCrawl_p.java, htroot/Status.java, htroot/api/status_p.java, htroot/yacy/crawlReceipt.java, htroot/yacy/urls.java, source/net/yacy/crawler/Balancer.java, source/net/yacy/crawler/HostBalancer.java, source/net/yacy/crawler/HostQueue.java, source/net/yacy/crawler/LegacyBalancer.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/NoticedURL.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java
Mon May 11 00:37:04 CEST 2015
by reger
refactor getBookmark 
to consistently check existence by != null (w/o throwing exception on not found)

Changed Files: htroot/Bookmarks.java, htroot/api/bookmarks/get_bookmarks.java, htroot/api/bookmarks/get_folders.java, htroot/api/bookmarks/posts/all.java, htroot/api/bookmarks/posts/get.java, htroot/api/bookmarks/tags/addTag_p.java, htroot/api/bookmarks/xbel/xbel.java, htroot/api/ymarks/import_ymark.java, htroot/yacysearchitem.java, source/net/yacy/data/BookmarksDB.java
Thu Apr 23 06:36:57 CEST 2015
by Michael Peter Christen
update to bootstrap.css 3.3.4
Changed Files: htroot/env/bootstrap/css/bootstrap-theme.css, htroot/env/bootstrap/css/bootstrap-theme.css.map, htroot/env/bootstrap/css/bootstrap-theme.min.css, htroot/env/bootstrap/css/bootstrap.css, htroot/env/bootstrap/css/bootstrap.css.map, htroot/env/bootstrap/css/bootstrap.min.css, htroot/env/bootstrap/fonts/glyphicons-halflings-regular.eot, htroot/env/bootstrap/fonts/glyphicons-halflings-regular.svg, htroot/env/bootstrap/fonts/glyphicons-halflings-regular.ttf, htroot/env/bootstrap/fonts/glyphicons-halflings-regular.woff, htroot/env/bootstrap/js/bootstrap.js, htroot/env/bootstrap/js/bootstrap.min.js, htroot/env/templates/header.template
Wed Apr 15 13:17:23 CEST 2015
by Michael Peter Christen
enhanced time zone management for indexed data:
to support the new time parser and search functions in YaCy, a
high-precision detection of date and time of day is necessary. That
requires that both the time zone of the document content and the time
zone of the user doing a search are detected. The time zone of the search
request is determined automatically from the browser's time zone offset,
which is delivered with the search request automatically and invisibly to
the user. The time zone for the content of web pages cannot be detected
automatically and must be an attribute of crawl starts. The advanced
crawl start now provides an input field to set the time zone in minutes
as an offset number. All parsers must get a time zone offset passed, so
this required a change of the parser Java API. Many other changes
were made to correct the wrong handling of dates in YaCy, which
was to add a correction based on the time zone of the server. Now no
correction is added and all dates in YaCy are in the UTC/GMT time zone, a
normalized time zone for all peers.
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartSite.html, htroot/Crawler_p.java, htroot/HostBrowser.java, htroot/IndexControlRWIs_p.java, htroot/NetworkHistory.java, htroot/QuickCrawlLink_p.java, htroot/api/bookmarks/posts/get.java, htroot/api/push_p.java, htroot/api/timeline_p.java, htroot/index.html, htroot/rct_p.java, htroot/yacy/search.java, htroot/yacy/transferURL.java, htroot/yacysearch.html, htroot/yacysearch.java, htroot/yacysearchtrailer.java, source/net/yacy/cora/date/AbstractFormatter.java, source/net/yacy/cora/date/DateFormatter.java, source/net/yacy/cora/date/GenericFormatter.java, source/net/yacy/cora/date/ISO8601Formatter.java, source/net/yacy/cora/document/feed/RSSMessage.java, source/net/yacy/cora/federate/FederateSearchManager.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/retrieval/Request.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/crawler/retrieval/SitemapImporter.java, source/net/yacy/data/BlogBoard.java, source/net/yacy/data/BookmarkHelper.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/DateDetection.java, source/net/yacy/document/Parser.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/importer/MediawikiImporter.java, source/net/yacy/document/importer/ResumptionToken.java, source/net/yacy/document/parser/apkParser.java, source/net/yacy/document/parser/audioTagParser.java, source/net/yacy/document/parser/augment/AugmentParser.java, source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/csvParser.java, source/net/yacy/document/parser/docParser.java, 
source/net/yacy/document/parser/dwgParser.java, source/net/yacy/document/parser/genericParser.java, source/net/yacy/document/parser/gzipParser.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/ScraperInputStream.java, source/net/yacy/document/parser/htmlParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/metadataImageParser.java, source/net/yacy/document/parser/linkScraperParser.java, source/net/yacy/document/parser/mmParser.java, source/net/yacy/document/parser/odtParser.java, source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/document/parser/pptParser.java, source/net/yacy/document/parser/psParser.java, source/net/yacy/document/parser/rdfParser.java, source/net/yacy/document/parser/rdfa/impl/RDFaParser.java, source/net/yacy/document/parser/rssParser.java, source/net/yacy/document/parser/rtfParser.java, source/net/yacy/document/parser/sevenzipParser.java, source/net/yacy/document/parser/sidAudioParser.java, source/net/yacy/document/parser/sitemapParser.java, source/net/yacy/document/parser/swfParser.java, source/net/yacy/document/parser/tarParser.java, source/net/yacy/document/parser/torrentParser.java, source/net/yacy/document/parser/vcfParser.java, source/net/yacy/document/parser/vsdParser.java, source/net/yacy/document/parser/xlsParser.java, source/net/yacy/document/parser/zipParser.java, source/net/yacy/http/ProxyCacheHandler.java, source/net/yacy/http/ProxyHandler.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/kelondro/blob/ArrayStack.java, source/net/yacy/kelondro/blob/BEncodedHeapBag.java, source/net/yacy/kelondro/blob/Tables.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/table/SplitTable.java, source/net/yacy/peers/NewsDB.java, source/net/yacy/peers/Seed.java, 
source/net/yacy/peers/graphics/WebStructureGraph.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/EventTracker.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/AccessTracker.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/server/http/HTTPDProxyHandler.java
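The normalization described in the entry above, turning a local date and time plus an offset in minutes into UTC, can be sketched with java.time. This is an illustration only: the names are hypothetical, and the sign convention of the offset (minutes to add to local time to reach UTC, as reported by JavaScript's Date.getTimezoneOffset()) is an assumption, since the entry does not state which convention YaCy uses.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class UtcNormalizationSketch {

    /**
     * Converts a local date-time to a UTC instant.
     *
     * @param offsetMinutes minutes to add to local time to reach UTC
     *                      (assumed convention; UTC+2 is -120)
     */
    static Instant toUtc(LocalDateTime local, int offsetMinutes) {
        ZoneOffset zone = ZoneOffset.ofTotalSeconds(-offsetMinutes * 60);
        return local.toInstant(zone);
    }

    public static void main(String[] args) {
        // 13:17:23 local time in a UTC+2 zone
        LocalDateTime local = LocalDateTime.of(2015, 4, 15, 13, 17, 23);
        System.out.println(toUtc(local, -120));
    }
}
```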
Sun Apr 05 23:38:14 CEST 2015
by reger
upd to PDFBox 1.8.9
Changed Files: .classpath, build.xml, lib/fontbox-1.8.9.License, lib/fontbox-1.8.9.jar, lib/jempbox-1.8.9.License, lib/jempbox-1.8.9.jar, lib/pdfbox-1.8.9.License, lib/pdfbox-1.8.9.jar, nbproject/project.xml, pom.xml
Sun Mar 22 02:47:12 CET 2015
by reger
upd to Jetty 9.2.10
Changed Files: .classpath, build.xml, lib/jetty-9.2.10.v20150310.License, lib/jetty-client-9.2.10.v20150310.jar, lib/jetty-continuation-9.2.10.v20150310.jar, lib/jetty-deploy-9.2.10.v20150310.jar, lib/jetty-http-9.2.10.v20150310.jar, lib/jetty-io-9.2.10.v20150310.jar, lib/jetty-jmx-9.2.10.v20150310.jar, lib/jetty-proxy-9.2.10.v20150310.jar, lib/jetty-security-9.2.10.v20150310.jar, lib/jetty-server-9.2.10.v20150310.jar, lib/jetty-servlet-9.2.10.v20150310.jar, lib/jetty-servlets-9.2.10.v20150310.jar, lib/jetty-util-9.2.10.v20150310.jar, lib/jetty-webapp-9.2.10.v20150310.jar, lib/jetty-xml-9.2.10.v20150310.jar, nbproject/project.xml, pom.xml
Mon Mar 02 18:00:20 CET 2015
by Michael Peter Christen
Show dates in the content of a document in the search result:
- if an eventDate is given in the search result, replace the document
date with the event date and prefix it with the string "on ".
- the document date is omitted if a date from the content is shown

Added also the date as fields in the json and rss result sets.
Changed Files: htroot/yacysearch.rss, htroot/yacysearchitem.html, htroot/yacysearchitem.java, htroot/yacysearchitem.json, htroot/yacysearchitem.xml, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/search/snippet/ResultEntry.java
Mon Mar 02 04:30:10 CET 2015
by Michael Peter Christen
added a new way of content browsing in search results:
- date navigation

The date is taken from the CONTENT of the documents / web pages, NOT
from a date submitted in the context of metadata (i.e. http header or
html head form). This makes it possible to search for documents in the
future, i.e. when documents contain event descriptions for future
events.

The date is written to an index field which is now enabled by default.
All documents are scanned for contained date mentions.
To visualize the dates for a specific search results, a histogram
showing the number of documents for each day is displayed. To render
these histograms the morris.js library is used. Morris.js requires also
raphael.js which is now also integrated in YaCy.

The histogram is now also displayed in the index browser by default.

To select a specific range from a search result, the following modifiers
have been introduced:
from:<date>
to:<date>
These modifiers can be used separately (i.e. only 'from' or only 'to')
to describe an open interval or combined to have a closed interval. Both
dates are inclusive. To select a specific single date only, use the
'on:' modifier.
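
The inclusive from:/to: semantics described above can be sketched roughly as follows. This is a minimal illustration and not the code in QueryModifier.java: the class DateRangeModifier, its methods, and the ISO yyyy-MM-dd date format are all assumptions made for the example.

```java
import java.time.LocalDate;

/** Hypothetical sketch: an inclusive date filter built from from:/to: modifiers. */
final class DateRangeModifier {
    private final LocalDate from; // null means the interval is open at the start
    private final LocalDate to;   // null means the interval is open at the end

    DateRangeModifier(LocalDate from, LocalDate to) {
        this.from = from;
        this.to = to;
    }

    /** Scans a query for from:<date> and to:<date> tokens (ISO yyyy-MM-dd is assumed). */
    static DateRangeModifier parse(String query) {
        LocalDate from = null;
        LocalDate to = null;
        for (String token : query.trim().split("\\s+")) {
            if (token.startsWith("from:")) {
                from = LocalDate.parse(token.substring("from:".length()));
            } else if (token.startsWith("to:")) {
                to = LocalDate.parse(token.substring("to:".length()));
            }
        }
        return new DateRangeModifier(from, to);
    }

    /** Both bounds are inclusive; a missing bound leaves that side open. */
    boolean matches(LocalDate date) {
        boolean notBeforeStart = from == null || !date.isBefore(from);
        boolean notAfterEnd = to == null || !date.isAfter(to);
        return notBeforeStart && notAfterEnd;
    }
}
```

With a query such as "yacy from:2015-03-01 to:2015-03-31", both boundary days match; with only a from: token, every later date matches.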

The histogram shows blue and green lines; the green lines denote weekend
days (Saturday and Sunday).

Clicking on bars in the histogram has the following reaction:
1st click: add a from:<date> modifier for the date of the bar
2nd click: add a to:<date> modifier for the date of the bar
3rd click: remove the from and to modifiers and set an on:<date> modifier for the bar
When the on:<date> modifier is used, the histogram shows an unlimited
time period. This makes it possible to click again (4th click) which is
then interpreted as a 1st click again (sets a from modifier).
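
The click cycle above behaves like a small state machine. The sketch below models it in Java purely for illustration; the real behaviour lives in the search page's JavaScript, and the class name, the state names, and the choice to drop the on: modifier on the 4th click are assumptions.

```java
import java.util.StringJoiner;

/** Hypothetical model of the histogram click cycle: from -> to -> on -> from ... */
final class HistogramClickCycle {
    private enum State { NONE, FROM_SET, RANGE_SET, ON_SET }

    private State state = State.NONE;
    private String from;
    private String to;
    private String on;

    /** Applies one click on the bar for the given date. */
    void click(String date) {
        switch (state) {
            case NONE:
            case ON_SET: // a 4th click is treated like a 1st click (assumed to replace on:)
                on = null;
                from = date;
                state = State.FROM_SET;
                break;
            case FROM_SET: // 2nd click closes the interval
                to = date;
                state = State.RANGE_SET;
                break;
            case RANGE_SET: // 3rd click replaces the range with a single day
                from = null;
                to = null;
                on = date;
                state = State.ON_SET;
                break;
        }
    }

    /** Returns the modifiers that would currently be appended to the query. */
    String modifiers() {
        StringJoiner joined = new StringJoiner(" ");
        if (from != null) joined.add("from:" + from);
        if (to != null) joined.add("to:" + to);
        if (on != null) joined.add("on:" + on);
        return joined.toString();
    }
}
```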

The display feature is NOT switched on by default; to switch it on use
the /ConfigSearchPage_p.html servlet.
Changed Files: defaults/solr.collection.schema, defaults/yacy.init, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/CrawlStartExpert.java, htroot/HostBrowser.html, htroot/HostBrowser.java, htroot/WebStructurePicture_p.java, htroot/env/morris.css, htroot/index.html, htroot/index.java, htroot/js/morris.js, htroot/js/raphael-min.js, htroot/yacysearch.html, htroot/yacysearchtrailer.html, htroot/yacysearchtrailer.java, htroot/yacysearchtrailer.json, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/cora/sorting/ClusteredScoreMap.java, source/net/yacy/crawler/HostQueue.java, source/net/yacy/crawler/data/ResultURLs.java, source/net/yacy/data/DidYouMean.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/DateDetection.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/Evaluation.java, source/net/yacy/peers/Seed.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java
Sat Feb 28 15:46:46 CET 2015
by Michael Peter Christen
Fix for Jetty "JetLeak" bug: update to jetty 9.2.9
The bug was inside the jetty library, for details see:
http://blog.gdssecurity.com/labs/2015/2/25/jetleak-vulnerability-remote-leakage-of-shared-buffers-in-je.html
We recommend updating your YaCy peer with this bugfix.
Changed Files: .classpath, build.xml, lib/jetty-9.2.9.v20150224.License, lib/jetty-client-9.2.9.v20150224.jar, lib/jetty-continuation-9.2.9.v20150224.jar, lib/jetty-deploy-9.2.9.v20150224.jar, lib/jetty-http-9.2.9.v20150224.jar, lib/jetty-io-9.2.9.v20150224.jar, lib/jetty-jmx-9.2.9.v20150224.jar, lib/jetty-proxy-9.2.9.v20150224.jar, lib/jetty-security-9.2.9.v20150224.jar, lib/jetty-server-9.2.9.v20150224.jar, lib/jetty-servlet-9.2.9.v20150224.jar, lib/jetty-servlets-9.2.9.v20150224.jar, lib/jetty-util-9.2.9.v20150224.jar, lib/jetty-webapp-9.2.9.v20150224.jar, lib/jetty-xml-9.2.9.v20150224.jar
Mon Feb 23 22:54:49 CET 2015
by Marc Nause
Changes to improve compatibility with OpenBSD. (see
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5503)
Changed Files: bin/addrss.sh, bin/apicall.sh, bin/apicat.sh, bin/checkalive.sh, bin/checkindex.sh, bin/clearall.sh, bin/clearapi.sh, bin/clearcache.sh, bin/clearindex.sh, bin/deleteurl.sh, bin/deploy.sh, bin/down.sh, bin/dumpcheck.sh, bin/importmediawiki.sh, bin/importurllist.sh, bin/indexdump.sh, bin/indexrestore.sh, bin/myip.sh, bin/passwd.sh, bin/search.sh, bin/search1.sh, bin/searchall.sh, bin/up.sh, startYACY.sh
Sat Feb 07 00:44:09 CET 2015
by reger
upd to Jetty 9.2.7
Changed Files: build.xml, lib/jetty-9.2.7.v20150116.License, lib/jetty-client-9.2.7.v20150116.jar, lib/jetty-continuation-9.2.7.v20150116.jar, lib/jetty-deploy-9.2.7.v20150116.jar, lib/jetty-http-9.2.7.v20150116.jar, lib/jetty-io-9.2.7.v20150116.jar, lib/jetty-jmx-9.2.7.v20150116.jar, lib/jetty-proxy-9.2.7.v20150116.jar, lib/jetty-security-9.2.7.v20150116.jar, lib/jetty-server-9.2.7.v20150116.jar, lib/jetty-servlet-9.2.7.v20150116.jar, lib/jetty-servlets-9.2.7.v20150116.jar, lib/jetty-util-9.2.7.v20150116.jar, lib/jetty-webapp-9.2.7.v20150116.jar, lib/jetty-xml-9.2.7.v20150116.jar, nbproject/project.xml, pom.xml
Fri Jan 30 13:20:56 CET 2015
by Michael Peter Christen
added a html field scraper which reads text from html entities of a
given css class and extends a given vocabulary with a term consisting
of the text content of the html class tag. Additionally, the term is
included into the semantic facet of the document. This allows the
creation of faceted search to documents without the pre-creation of
vocabularies; instead, the vocabulary is created on-the-fly, possibly
for use in other crawls. If any of the term scraping for a specific
vocabulary is successful on a document, this vocabulary is excluded for
auto-annotation on the page.

To use this feature, do the following:
- create a vocabulary on /Vocabulary_p.html (if not existent)
- in /CrawlStartExpert.html you will now see the vocabularies as column
in a table. The second column provides text fields where you can name
the class of html entities where the literal of the corresponding
vocabulary shall be scraped out
- when doing a search, you will see the content of the scraped fields in
a navigation facet for the given vocabulary
Changed Files: htroot/ConfigHeuristics_p.java, htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java, htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, htroot/osm.java, source/net/yacy/cora/language/synonyms/AutotaggingLibrary.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/data/BookmarkHelper.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/Parser.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/VocabularyScraper.java, source/net/yacy/document/importer/MediawikiImporter.java, source/net/yacy/document/parser/apkParser.java, source/net/yacy/document/parser/audioTagParser.java, source/net/yacy/document/parser/augment/AugmentParser.java, source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/csvParser.java, source/net/yacy/document/parser/docParser.java, source/net/yacy/document/parser/dwgParser.java, source/net/yacy/document/parser/genericParser.java, source/net/yacy/document/parser/gzipParser.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/ScraperInputStream.java, source/net/yacy/document/parser/htmlParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/metadataImageParser.java, source/net/yacy/document/parser/linkScraperParser.java, source/net/yacy/document/parser/mmParser.java, source/net/yacy/document/parser/odtParser.java, source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/document/parser/pptParser.java, source/net/yacy/document/parser/psParser.java, source/net/yacy/document/parser/rdfParser.java, 
source/net/yacy/document/parser/rdfa/impl/RDFaParser.java, source/net/yacy/document/parser/rssParser.java, source/net/yacy/document/parser/rtfParser.java, source/net/yacy/document/parser/sevenzipParser.java, source/net/yacy/document/parser/sidAudioParser.java, source/net/yacy/document/parser/sitemapParser.java, source/net/yacy/document/parser/swfParser.java, source/net/yacy/document/parser/tarParser.java, source/net/yacy/document/parser/torrentParser.java, source/net/yacy/document/parser/vcfParser.java, source/net/yacy/document/parser/vsdParser.java, source/net/yacy/document/parser/xlsParser.java, source/net/yacy/document/parser/zipParser.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java
Thu Jan 29 02:28:03 CET 2015
by Michael Peter Christen
replace with CommonPattern.SPACE for split
Changed Files: htroot/BlogComments.java, htroot/ViewFile.java, htroot/Vocabulary_p.java, htroot/processing/domaingraph/applet/domaingraph.java, htroot/yacy/message.java, htroot/yacysearch_location.java, source/net/yacy/data/DidYouMean.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/search/schema/CollectionConfiguration.java, test/net/yacy/search/snippet/TextSnippetTest.java
Thu Jan 29 01:46:22 CET 2015
by Michael Peter Christen
applying precompiled CommonPattern.COMMA.split to all places where
split(",") was used
Changed Files: htroot/ConfigNetwork_p.java, htroot/IndexControlRWIs_p.java, htroot/SettingsAck_p.java, htroot/ViewFile.java, htroot/WebStructurePicture_p.java, htroot/api/feed.java, htroot/api/table_p.java, htroot/api/timeline_p.java, htroot/yacy/list.java, htroot/yacysearch.java, source/net/yacy/contentcontrol/SMWListSyncThread.java, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/geo/GeonamesLocation.java, source/net/yacy/cora/geo/OpenGeoDBLocation.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/cora/storage/Files.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/data/BookmarksDB.java, source/net/yacy/data/ListManager.java, source/net/yacy/data/Translator.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/csvParser.java, source/net/yacy/document/parser/docParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/pptParser.java, source/net/yacy/migration.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/repository/BlacklistFile.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/ranking/RankingProfile.java, source/net/yacy/server/http/HTTPDFileHandler.java
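
The idea behind the two CommonPattern commits above is to compile the separator pattern once and reuse it for every split. A minimal sketch follows; the class PrecompiledSplit and its COMMA constant are stand-ins for illustration, and the actual definition of CommonPattern.COMMA is assumed rather than quoted.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for a shared holder of precompiled patterns. */
final class PrecompiledSplit {
    // Compiled once at class load and reused by every caller.
    static final Pattern COMMA = Pattern.compile(",");

    /** Splits a comma-separated list using the shared pattern. */
    static String[] splitList(String csv) {
        return COMMA.split(csv);
    }
}
```

As with String.split, Pattern.split drops trailing empty strings, so switching a call site from one to the other should not change its result.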
Fri Jan 23 11:30:13 CET 2015
by Michael Peter Christen
replaced old JavaApplicationStub for Mac Application framework with new
script. Adapted the YaCyApp environment and fixed a problem in the
startYACY.sh application wrapper where wrong usage of the logging
option -l caused files to be written to the YaCy application folder.
As a result of this fix, it is not necessary any more to change path
settings in Info.plist if libraries are changed.
Changed Files: addon/YaCy.app/Contents/Info.plist, addon/YaCy.app/Contents/MacOS/JavaApplicationStub, build.xml, source/net/yacy/gui/Toolkits.java, source/net/yacy/gui/Tray.java, source/net/yacy/gui/YaCyApp.java, source/net/yacy/gui/framework/Application.java, startYACY.sh


Bugfixes   
Jump to: YaCy Release current_development top / Other Changes

CommitDescription
Thu Dec 03 00:33:13 CET 2015
by Michael Peter Christen
fix for npe
Changed Files: source/net/yacy/search/index/Segment.java
Tue Dec 01 00:06:50 CET 2015
by reger
prevent exception on repeated ViewImage with same urlLicense
Changed Files: htroot/ViewImage.java
Mon Nov 30 13:19:49 CET 2015
by Michael Peter Christen
fixed classpath
Changed Files: .classpath
Fri Nov 20 10:15:54 CET 2015
by luc
Corrected IcedTea version. See http://mantis.tokeek.de/view.php?id=615
Changed Files: readme.mediawiki
Thu Nov 19 21:08:00 CET 2015
by reger
fix link to logo (yacysearch.xsl)
Changed Files: htroot/yacysearch.xsl
Thu Nov 05 09:40:24 CET 2015
by luc
Ensure closing of InputStream even when an exception occurs.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
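
One common way to guarantee that a stream is closed even when an exception occurs is try-with-resources (available since Java 7). The sketch below shows that technique only; it is not the code from YaCyDefaultServlet.java, and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies a stream and always closes the input. */
final class StreamCopy {
    /** Copies in to out; in is closed even if reading or writing throws. */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        try (InputStream source = in) { // closed automatically on every exit path
            byte[] buffer = new byte[4096];
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```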
Sat Oct 31 22:53:59 CET 2015
by reger
fix detection of https port changed after set in System Admin
Changed Files: htroot/SettingsAck_p.java
Sun Oct 25 19:38:42 CET 2015
by reger
fix typo in WikiCode coordinate calculation
Changed Files: source/net/yacy/data/wiki/WikiCode.java
Wed Oct 14 15:16:16 CEST 2015
by Michael Peter Christen
fix for image size field values (must be multi-valued)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sun Oct 11 06:06:40 CEST 2015
by reger
fix link target on iframe list in CrawlProfileEditor
Changed Files: htroot/Table_API_p.html
Thu Sep 24 13:53:54 CEST 2015
by Michael Peter Christen
fix for latest merge
Changed Files: htroot/env/style.java
Sat Sep 12 20:07:43 CEST 2015
by reger
fix test method (add throw for URIMetadataNode)
Changed Files: test/net/yacy/search/snippet/TextSnippetTest.java
Wed Sep 02 02:36:31 CEST 2015
by reger
fix init of error cache, use latest faildates => load_date_dt
Changed Files: source/net/yacy/search/index/ErrorCache.java, source/net/yacy/search/schema/CollectionSchema.java
Sun Aug 23 23:01:20 CEST 2015
by reger
avoid runtime exception by earlier testing for seed.ip=null
Changed Files: source/net/yacy/peers/Protocol.java
Tue Aug 11 00:42:26 CEST 2015
by Michael Peter Christen
fix for filesystem crawl
Changed Files: source/net/yacy/crawler/data/CrawlProfile.java
Mon Aug 03 05:15:34 CEST 2015
by Michael Peter Christen
revert of fq transformation (recent fix)
Changed Files: source/net/yacy/search/query/QueryParams.java
Fri Jul 10 17:34:29 CEST 2015
by Michael Peter Christen
more fixes for special windows paths
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Fri Jul 10 17:14:14 CEST 2015
by Michael Peter Christen
patch for bad windows file paths
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Mon Jun 01 01:56:09 CEST 2015
by Michael Peter Christen
fix for index import
Changed Files: source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/search/Switchboard.java
Thu May 28 17:43:52 CEST 2015
by Michael Peter Christen
fix for unresolved pattern
Changed Files: htroot/CrawlProfileEditor_p.java
Fri May 22 11:15:53 CEST 2015
by Michael Peter Christen
fix for division by zero
Changed Files: htroot/yacysearchtrailer.java
Mon May 11 14:46:09 CEST 2015
by Michael Peter Christen
disabled debug thread dumps
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Sun May 10 05:18:23 CEST 2015
by reger
fix NPE in addToIndex when used outside searchEvent
Changed Files: source/net/yacy/search/Switchboard.java
Fri May 08 15:31:01 CEST 2015
by Michael Peter Christen
added temporary debug output in http client
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Thu May 07 03:25:19 CEST 2015
by reger
precaution against NPE on createorgetBookmark on search result
Changed Files: htroot/yacysearch.java
Sun Apr 12 01:11:47 CEST 2015
by reger
fix typecast for css links
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sat Apr 11 12:30:29 CEST 2015
by Michael Peter Christen
fix for filter queries
Changed Files: source/net/yacy/search/query/QueryParams.java
Sat Apr 11 12:20:29 CEST 2015
by Michael Peter Christen
fix for not valid json in case that topics are switched off
Changed Files: htroot/yacysearchtrailer.json
Thu Apr 09 14:21:23 CEST 2015
by Michael Peter Christen
logging fix
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Tue Apr 07 17:02:02 CEST 2015
by Michael Peter Christen
better debugging of fq
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Tue Mar 31 02:20:13 CEST 2015
by reger
fix NPE in location search on missing/empty PubDate in underlying rss data
Changed Files: htroot/yacysearch_location.java, source/net/yacy/cora/document/feed/RSSMessage.java
Sat Mar 28 21:12:00 CET 2015
by reger
shorten exception logging on unavailable connection in Load_RSS_p servlet
Changed Files: htroot/Load_RSS_p.java
Thu Mar 26 00:21:31 CET 2015
by reger
fix NPE on viewfile of url not in index
Changed Files: htroot/ViewFile.java
Wed Mar 25 13:21:36 CET 2015
by Michael Peter Christen
fix: banner did not show link and qph for portal mode
Changed Files: htroot/Banner.java
Wed Mar 18 22:04:03 CET 2015
by reger
fix get fresh_date_dt to allow returned value to be a date in the future
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Fri Mar 13 02:02:53 CET 2015
by reger
fix MultiProtocolURL mailto protocol detection
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Wed Mar 11 21:28:57 CET 2015
by reger
fix link to DeReWo project page
Changed Files: htroot/DictionaryLoader_p.html
Wed Mar 11 20:02:23 CET 2015
by reger
fix link to DeReWo download file
Changed Files: source/net/yacy/document/LibraryProvider.java
Mon Mar 02 04:43:42 CET 2015
by Michael Peter Christen
bugfix for fixed host/port
Changed Files: htroot/ConfigSearchPage_p.html, htroot/HostBrowser.html
Sun Feb 15 05:30:14 CET 2015
by reger
fix NPE in snippet computation
Changed Files: source/net/yacy/search/snippet/TextSnippet.java
Sat Feb 14 01:38:20 CET 2015
by Michael Peter Christen
fixed httpclient lib paths in ant build
Changed Files: build.xml
Tue Feb 10 08:33:30 CET 2015
by Michael Peter Christen
npe fix for latest scraper feature
Changed Files: source/net/yacy/document/Condenser.java
Sun Feb 08 00:15:30 CET 2015
by reger
fix: searchoption hint for heuristic
Changed Files: htroot/index.html, htroot/index.java
Wed Feb 04 15:03:34 CET 2015
by Michael Peter Christen
IPv6 Fix for push interface
Changed Files: htroot/api/push_p.java
Wed Feb 04 11:55:27 CET 2015
by Michael Peter Christen
fix for failed selection of terms in faceted search with vocabularies
Changed Files: htroot/yacysearchtrailer.java
Wed Jan 28 17:45:25 CET 2015
by Michael Peter Christen
fix for wkhtmltopdf (custom header does not work)
Changed Files: source/net/yacy/cora/util/Html2Image.java
Wed Jan 28 13:40:41 CET 2015
by Michael Peter Christen
fix for urlmaskfilter
Changed Files: htroot/yacysearch.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Tue Jan 27 16:24:27 CET 2015
by Michael Peter Christen
fix for vocabulary on/off setting
Changed Files: htroot/Vocabulary_p.java, htroot/api/snapshot.java, source/net/yacy/cora/lod/vocabulary/Tagging.java
Fri Jan 23 18:34:38 CET 2015
by Michael Peter Christen
fix for shell script
Changed Files: startYACY.sh


Other Changes   
Jump to: YaCy Release current_development top / Bugfixes

CommitDescription
Thu Dec 03 00:39:15 CET 2015
by Michael Peter Christen
less logging in new language detection
Changed Files: source/net/yacy/document/language/Identificator.java
Wed Dec 02 22:57:59 CET 2015
by reger
Apply collection query constraint/modifier to rwi result stack.
Collection is not available in pure rwi entries (but in local solr metadata).
But if the user wishes to filter by query constraint, rwi shall also adhere to this
(even if only rwi entries with parsed or solr received metadata may fit).
 
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Dec 01 14:39:59 CET 2015
by Michael Peter Christen
Merge pull request #29 from luccioman/master

Modified images render error management
Changed Files: htroot/ViewImage.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Tue Dec 01 09:55:47 CET 2015
by luc
Corrected return type when licence is gone to be consistent with other
error cases.
Changed Files: htroot/ViewImage.java
Tue Dec 01 01:06:01 CET 2015
by luc
Corrected error management for unsupported image formats, parsing
errors, and unavailable resources : avoid logging too many Exceptions as
these errors easily occur when searching images.
Changed Files: htroot/ViewImage.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Mon Nov 30 18:57:16 CET 2015
by reger
upd maven pom (add langdetect)
Changed Files: pom.xml
Mon Nov 30 13:45:25 CET 2015
by Michael Peter Christen
Merge pull request #27 from Stepanov-Sergey/master

added Russian synonyms
Changed Files: addon/synonyms/thesaurus_ru_yacy
Mon Nov 30 13:35:41 CET 2015
by Michael Peter Christen
urlproxyheader must be in the default package because all classes in the
htroot path must be in the default package
Changed Files: htroot/proxymsg/urlproxyheader.java
Mon Nov 30 13:18:03 CET 2015
by Michael Peter Christen
Merge pull request #23 from linkerlin/patch-1

Create .travis.yml
Changed Files: .travis.yml
Mon Nov 30 09:37:47 CET 2015
by Sergey Stepanov
added Russian synonyms
Changed Files: addon/synonyms/thesaurus_ru_yacy
Sun Nov 29 05:19:39 CET 2015
by reger
read/init crawl queue in a thread
to speed-up YaCy start on large existing crawler queues
Changed Files: source/net/yacy/crawler/HostBalancer.java
Sun Nov 29 01:24:46 CET 2015
by reger
upd to slf4j-1.7.13
Changed Files: .classpath, build.xml, lib/jcl-over-slf4j-1.7.13.jar, lib/log4j-over-slf4j-1.7.13.jar, lib/slf4j-LICENSE.txt, lib/slf4j-api-1.7.13.jar, lib/slf4j-jdk14-1.7.13.jar, nbproject/project.xml, pom.xml
Sat Nov 28 23:09:15 CET 2015
by reger
remove unused md5 from ViewFile servlet params
Changed Files: htroot/ViewFile.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Sat Nov 28 02:43:38 CET 2015
by reger
fix vsdParser (Visio) parser return statement
(final block un-necessary throw)
Changed Files: source/net/yacy/document/parser/vsdParser.java
Fri Nov 27 02:41:02 CET 2015
by reger
remove md5_s from default index fields
it is not assigned a value / not used
Due to above also excluded from transfer protocol.
Changed Files: defaults/solr.collection.schema, source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Thu Nov 26 09:30:43 CET 2015
by luc
- No max dimensions specified : render raw image data when source and
target image format are the same.
- Corrected scaling condition.
Changed Files: htroot/ViewImage.java, source/net/yacy/peers/graphics/EncodedImage.java
Wed Nov 25 01:34:41 CET 2015
by reger
fix flux factor (additional crawl delay by access count) calculation
Changed Files: source/net/yacy/crawler/data/Latency.java
Sun Nov 22 21:26:18 CET 2015
by reger
throw exception if crawler hostqueue can't create hostpath directory.
In rare cases the hostname may not be a valid filesystem directory name
and the directory can't be created (e.g. when it contains a '*' char).
Throwing a MalformedURLException prevents the crawl queue from looping
on this invalid entry.
Changed Files: source/net/yacy/crawler/HostQueue.java
Fri Nov 20 19:35:39 CET 2015
by luc
Use same max file size when loading all resource bytes or opening stream
content
Changed Files: source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/crawler/retrieval/Response.java
Fri Nov 20 15:02:58 CET 2015
by luc
Rendering performance improvement : use EncodedImage constructor with
BufferedImage parameter to avoid re-rendering BufferedImage.
Changed Files: htroot/ViewImage.java
Fri Nov 20 14:35:36 CET 2015
by luc
Corrected scaling function for non RGB images.
Changed Files: htroot/ViewImage.java, source/net/yacy/peers/graphics/EncodedImage.java
Fri Nov 20 09:42:24 CET 2015
by luc
Refactoring : extracted write InputStream method.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
Fri Nov 20 09:32:30 CET 2015
by luc
Eclipse project configuration : added JavaScript nature and validation
Changed Files: .project
Fri Nov 20 09:29:02 CET 2015
by luc
Fixed compilation error.
Changed Files: htroot/proxymsg/urlproxyheader.java
Fri Nov 20 01:49:56 CET 2015
by reger
start using a template for urlproxy header
It is included as iframe /proxymsg/urlproxyheader.html
to allow full servlet functionality and flexibility to display some
index/meta data in the future.
Changed Files: htroot/proxymsg/urlproxyheader.html, htroot/proxymsg/urlproxyheader.java, source/net/yacy/http/servlets/UrlProxyServlet.java
Wed Nov 18 10:15:06 CET 2015
by luc
Process large or local file images dealing directly with content
InputStream.
Changed Files: htroot/ViewImage.java, source/net/yacy/cora/util/HTTPInputStream.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/repository/LoaderDispatcher.java, test/ViewImagePerfTest.java, test/ViewImageTest.java
Wed Nov 18 10:11:38 CET 2015
by luc
If available, check content length before downloading. Also check that
content length does not exceed Integer.MAX_VALUE.
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
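
A pre-download length check of the kind described above might look like the following sketch. It is not the code in HTTPClient.java: the class name, the method signature, and the convention that -1 means "unknown" or "no limit" are assumptions made for illustration.

```java
/** Hypothetical sketch of a content-length check performed before a download. */
final class ContentLengthCheck {
    /**
     * @param declaredLength value of the Content-Length header, or -1 if unknown (assumed convention)
     * @param maxBytes       caller's size limit, or -1 for no limit (assumed convention)
     * @return true if the content may be loaded into a byte array
     */
    static boolean mayDownload(long declaredLength, long maxBytes) {
        if (declaredLength < 0) {
            return true; // length unknown: nothing can be checked up front
        }
        if (declaredLength > Integer.MAX_VALUE) {
            return false; // would not fit into a single Java array
        }
        return maxBytes < 0 || declaredLength <= maxBytes;
    }
}
```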
Wed Nov 18 10:08:06 CET 2015
by luc
Ensure resource is closed when reading a full file InputStream
Changed Files: source/net/yacy/crawler/retrieval/FileLoader.java, source/net/yacy/crawler/retrieval/SMBLoader.java, source/net/yacy/kelondro/util/FileUtils.java
Tue Nov 17 23:45:29 CET 2015
by reger
have psParser cleanup temp file
Changed Files: source/net/yacy/document/parser/psParser.java
Mon Nov 16 21:37:45 CET 2015
by reger
set tmpfile.deleteOnExit by default,
to make sure files are removed on shutdown.
Changed Files: source/net/yacy/document/parser/psParser.java, source/net/yacy/kelondro/util/FileUtils.java
Mon Nov 16 01:06:20 CET 2015
by reger
Exclude repetitive protocol part in tokenized url 
used as description if none is avail. from parser.
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Sun Nov 15 06:06:37 CET 2015
by reger
harmonize wordsintitle & CollectionSchema.title_words_val calculation,
remove obsolete partial init of wordreference from urimetadata
Changed Files: source/net/yacy/kelondro/data/word/WordReferenceRow.java, source/net/yacy/kelondro/data/word/WordReferenceVars.java, source/net/yacy/search/index/Segment.java
Sun Nov 15 00:39:38 CET 2015
by reger
add link to quick select blacklist 
from title list
Changed Files: htroot/Blacklist_p.html
Sun Nov 15 00:34:22 CET 2015
by reger
add German translation to re-crawl job
Changed Files: locales/de.lng
Sat Nov 14 21:16:31 CET 2015
by reger
upd to httpcore 4.4.4
Changed Files: .classpath, build.xml, lib/httpcore-4.4.4.License, lib/httpcore-4.4.4.jar, nbproject/project.xml
Fri Nov 13 20:10:47 CET 2015
by reger
fix yacysearch.json "totalResults"
element "totalResults" is included twice (at begin & end), 
only the element after performing the search holds number > 0
see http://mantis.tokeek.de/view.php?id=608
Changed Files: htroot/yacysearch.json
Fri Nov 13 01:48:28 CET 2015
by reger
Sort out double keywords (dc_subject) early in parsed documents
- by directly using Set instead of List
- remove unneeded String[] getter
Changed Files: htroot/api/getpageinfo.java, htroot/api/getpageinfo_p.java, source/net/yacy/document/Document.java, source/net/yacy/search/Switchboard.java
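
Collecting keywords in a Set rather than a List removes duplicates at insertion time. A minimal sketch of that idea follows; the class KeywordDedup, the trimming of entries, and the choice of LinkedHashSet (to keep first-seen order) are assumptions and do not reflect Document.java.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: deduplicating keywords by collecting them in a Set. */
final class KeywordDedup {
    /** Returns the distinct, non-empty keywords in first-seen order. */
    static Set<String> uniqueKeywords(String... raw) {
        Set<String> keywords = new LinkedHashSet<>();
        for (String keyword : raw) {
            if (keyword == null) {
                continue;
            }
            String trimmed = keyword.trim(); // trimming is an assumption of this sketch
            if (!trimmed.isEmpty()) {
                keywords.add(trimmed); // add() silently ignores duplicates
            }
        }
        return keywords;
    }
}
```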
Thu Nov 12 08:21:37 CET 2015
by luc
Added links to more image test suites.
Changed Files: test/viewImageTest/ViewImageTest.html
Wed Nov 11 00:57:51 CET 2015
by reger
improve locale translator
- skip empty line
- more robust file section detection (whitespace independent)
Changed Files: source/net/yacy/data/Translator.java
Tue Nov 10 20:45:33 CET 2015
by sixcooler
do not store subfield *_coordinate + make all num-fields being docvalues
Changed Files: source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java
Tue Nov 10 20:43:58 CET 2015
by sixcooler
not using 'location' as defaultfacetfield - since we removed it being
default.
Changed Files: source/net/yacy/search/query/QueryParams.java
Tue Nov 10 20:39:46 CET 2015
by sixcooler
simplification / speedup of GenerationMemoryStrategy
Changed Files: source/net/yacy/kelondro/util/GenerationMemoryStrategy.java, source/net/yacy/kelondro/workflow/InstantBusyThread.java
Tue Nov 10 20:32:42 CET 2015
by sixcooler
do not store subfield *_coordinate
Changed Files: defaults/solr/schema.xml
Tue Nov 10 20:32:05 CET 2015
by sixcooler
set startuptype of most solr handlers to lazy
Changed Files: defaults/solr/solrconfig.xml
Tue Nov 10 20:27:17 CET 2015
by sixcooler
fix to not let the AccessTracker-Log grow too much, but have enough data
to monitor.
(+gitignore-correction)
Changed Files: source/net/yacy/search/query/AccessTracker.java
Tue Nov 10 01:29:13 CET 2015
by reger
harmonize document title for archive parsers
Changed Files: source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/gzipParser.java, source/net/yacy/document/parser/sevenzipParser.java, source/net/yacy/document/parser/tarParser.java, source/net/yacy/document/parser/zipParser.java
Mon Nov 09 08:18:32 CET 2015
by Linker Lin
Create .travis.yml
Changed Files: .travis.yml
Sat Nov 07 19:13:18 CET 2015
by reger
update bzip and gzip parser process,
to return one document for the file with combined parser results of the
contained file, registered with the supplied url and mime of the archive.
Changed Files: source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/gzipParser.java
Fri Nov 06 23:58:55 CET 2015
by reger
update zip and tar parser process,
to return one document for the file with combined parser results of the
containing files.
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/document/parser/tarParser.java, source/net/yacy/document/parser/zipParser.java
Wed Nov 04 21:52:02 CET 2015
by reger
optimize order of parsers to try
- start with a parser matching the remote supplied mime
Changed Files: source/net/yacy/document/TextParser.java
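
Trying the parser that matches the remotely supplied mime type first amounts to a stable reordering of the candidate list. The sketch below illustrates that under stated assumptions: ParserOrder and Candidate are hypothetical types, not the ones used in TextParser.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: put parsers matching the supplied mime type first. */
final class ParserOrder {
    /** Stand-in for a parser: a name plus the mime types it handles. */
    static final class Candidate {
        final String name;
        final List<String> mimeTypes;

        Candidate(String name, String... mimeTypes) {
            this.name = name;
            this.mimeTypes = Arrays.asList(mimeTypes);
        }
    }

    /** Returns the candidates with mime matches first; relative order is otherwise kept. */
    static List<Candidate> order(List<Candidate> candidates, String suppliedMime) {
        List<Candidate> matching = new ArrayList<>();
        List<Candidate> others = new ArrayList<>();
        for (Candidate candidate : candidates) {
            if (suppliedMime != null && candidate.mimeTypes.contains(suppliedMime)) {
                matching.add(candidate);
            } else {
                others.add(candidate);
            }
        }
        matching.addAll(others);
        return matching;
    }
}
```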
Wed Nov 04 02:57:00 CET 2015
by reger
use current tar library for untar files
- remove old source copy
Changed Files: source/net/yacy/utils/tarTools.java
Tue Nov 03 22:14:14 CET 2015
by reger
fix tarParser early exit on looping content
- adjust check of data available according to doc 
- return null on no recognized content (to not exit TextParser next parser try)
- use commons.compress directly
Changed Files: source/net/yacy/document/parser/tarParser.java
Tue Nov 03 03:35:01 CET 2015
by reger
fix bzipParser recognition
- Bzip2Inputstream checks magic byte itself to identify bz2 (leave it in input)
- try to supply fitting mime for parsing bz2 content
Changed Files: source/net/yacy/document/parser/bzipParser.java
Sat Oct 31 19:44:31 CET 2015
by reger
increase use of predefined CATCHALL_QUERY string
Changed Files: source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/QueryGoal.java
Sat Oct 31 19:18:46 CET 2015
by reger
Optimize internal imagequery focus on using content_type to select images
(in favor of url file extension)
Changed Files: source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/schema/CollectionConfiguration.java
Fri Oct 30 16:20:28 CET 2015
by luc
Avoid returning an empty image when target encoding is not supported or
when an error occurred while encoding.
Changed Files: htroot/ViewImage.java
Fri Oct 30 16:19:05 CET 2015
by luc
Updated javadocs for warning on target encoding format potential errors.
Changed Files: source/net/yacy/peers/graphics/EncodedImage.java, source/net/yacy/visualization/RasterPlotter.java
Fri Oct 30 05:18:16 CET 2015
by luc
Corrected images alpha channel rendering
Changed Files: htroot/ViewImage.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/peers/graphics/EncodedImage.java
Thu Oct 29 23:24:39 CET 2015
by luc
Made ViewImagePerfTest extend ViewImageTest to ease automated image
render tests
Changed Files: test/ViewImagePerfTest.java, test/ViewImageTest.java
Thu Oct 29 02:24:17 CET 2015
by luc
Corrected encoding extension arg parsing
Changed Files: test/ViewImageTest.java
Mon Oct 26 22:19:20 CET 2015
by reger
upd readme.mediawiki min java version 1.7
Changed Files: readme.mediawiki
Mon Oct 26 21:19:35 CET 2015
by reger
adjust MediaWiki importer geo coordinate calculation
- allow lat/long 0.xxx
- south / west assignment
include test class
Changed Files: source/net/yacy/data/wiki/WikiCode.java, test/net/yacy/data/wiki/WikiCodeTest.java
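
Illustration for the entry above: a minimal sketch of a degrees/minutes/seconds conversion that accepts a zero degree part and assigns a negative sign to south and west. The class and method names are hypothetical; this is not the code in WikiCode.java, only one plausible way to express the two points listed in the commit message.

```java
/** Hypothetical helper for illustration; not taken from WikiCode.java. */
final class GeoCoordSketch {

    /**
     * Converts degrees/minutes/seconds plus a hemisphere letter to signed
     * decimal degrees. South and west yield negative values.
     */
    static double toDecimalDegrees(double degrees, double minutes, double seconds, char hemisphere) {
        // A degree part of 0 is valid (e.g. 0 deg 30 min S), so it is not rejected here.
        final double value = degrees + minutes / 60.0 + seconds / 3600.0;
        switch (Character.toUpperCase(hemisphere)) {
            case 'N':
            case 'E':
                return value;
            case 'S':
            case 'W':
                return -value;
            default:
                throw new IllegalArgumentException("unknown hemisphere: " + hemisphere);
        }
    }
}
```

With this sketch, 0 degrees 30 minutes south becomes -0.5 rather than being dropped or stored as a positive value.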
Sun Oct 25 05:41:25 CET 2015
by reger
fix IndexImportMediawiki_p servlet's refresh header
add url parameter to make sure no parameters are included in the refresh url
which could cause an unwanted restart of the import job

see http://mantis.tokeek.de/view.php?id=591 comments
Changed Files: htroot/IndexImportMediawiki_p.html
Sun Oct 25 03:06:15 CET 2015
by reger
fix MediawikiImporter for bz2 dump
skip reading the bz2 file magic bytes to identify bz2 format, as an inputstream reset would be required. Commons Compress reads and checks the magic bytes internally and throws an IOException if they are wrong, making the pre-read obsolete.
Changed Files: source/net/yacy/document/importer/MediawikiImporter.java
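
For context on the entry above: a sketch of what a magic-byte pre-read involves when the stream has to stay usable afterwards. It is not the removed importer code; the class and method names are hypothetical and only java.io is used. The point is that the peek needs mark/reset support, which is the requirement the commit avoids by leaving the check to the decompressor.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper for illustration; not taken from MediawikiImporter.java. */
final class MagicPeekSketch {

    /** Peeks at the first three bytes ("BZh") and rewinds the stream afterwards. */
    static boolean startsWithBz2Magic(BufferedInputStream in) {
        try {
            in.mark(3); // remember the position so the bytes can be re-read
            final byte[] head = new byte[3];
            int n = 0;
            while (n < head.length) {
                final int r = in.read(head, n, head.length - n);
                if (r < 0) break; // stream shorter than the signature
                n += r;
            }
            in.reset(); // without this the decompressor would miss its own header
            return n == 3 && head[0] == 'B' && head[1] == 'Z' && head[2] == 'h';
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demonstrates that the stream still starts at the first byte after peeking. */
    static int firstByteAfterPeek(byte[] data) {
        final BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        startsWithBz2Magic(in);
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```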
Sun Oct 25 00:26:45 CEST 2015
by reger
fix a system.out to log.fine
in bmpParser
Changed Files: source/net/yacy/document/parser/images/bmpParser.java
Sat Oct 24 22:44:28 CEST 2015
by reger
remove override of dynamicField coordinate_p in solr schema
(coordinate_p is not a mandatory field and as such doesn't need to be declared as schema.field)
Changed Files: defaults/solr/schema.xml
Sat Oct 24 19:36:33 CEST 2015
by reger
fix init of peer flags
(remove hiding of ssl flag)
Changed Files: source/net/yacy/peers/Seed.java
Fri Oct 23 15:49:07 CEST 2015
by luc
Created a class to test ViewImage rendering against multiple image
files.
Changed Files: test/ViewImageTest.java
Fri Oct 23 14:12:00 CEST 2015
by luc
Corrected APNG test suite link name.
Changed Files: test/viewImageTest/ViewImageTest.html
Fri Oct 23 13:57:24 CEST 2015
by luc
Detailed javadoc.
Changed Files: test/ViewImagePerfTest.java
Fri Oct 23 12:27:52 CEST 2015
by luc
Filled ViewImageTest.html with all remaining IANA image file formats.
Added some links to test suites and specifications.
Changed Files: test/viewImageTest/ViewImageTest.html, test/viewImageTest/test/ACAD_r2000_sample.dwg, test/viewImageTest/test/ACAD_r2000_sample_original.dxf, test/viewImageTest/test/Animated_PNG_example_bouncing_beach_ball.png, test/viewImageTest/test/Specimens_of_calligraphy_and_natural_history_illustration.djvu, test/viewImageTest/test/stone_wall.vtf, test/viewImageTest/test/svflogo.svf
Thu Oct 22 02:35:58 CEST 2015
by reger
fix unnecessary set null of peer flags, causing reread
remove obsolete version flags
Changed Files: source/net/yacy/peers/Seed.java, source/net/yacy/peers/operation/yacyVersion.java
Thu Oct 22 00:36:34 CEST 2015
by luc
Patch to manage render or load errors is still needed after highslide.js
version upgrade.
Updated patch for better behavior consistency between browsers.
Changed Files: htroot/js/highslide/highslide.js
Wed Oct 21 02:49:51 CEST 2015
by luc
- Keep aspect ratio of images rendered directly by browser such as gif
and svg.
- Corrected quadratic rendering of landscape images with height smaller
than maxHeight
Changed Files: htroot/ViewImage.java, htroot/env/base.css, htroot/yacysearchitem.html, htroot/yacysearchitem.java
Wed Oct 21 02:14:04 CEST 2015
by reger
upd javascript img viewer to highslide 4.1.13
Changed Files: htroot/js/highslide/highslide.js
Tue Oct 20 01:17:37 CEST 2015
by luc
Display full size preview using ViewImage Servlet.
Changed Files: htroot/yacysearchitem.html, htroot/yacysearchitem.java
Tue Oct 20 01:15:02 CEST 2015
by luc
Added image preview error management.
Changed Files: htroot/js/highslide/highslide.js
Mon Oct 19 14:11:26 CEST 2015
by luc
Corrected NullPointerException case when ImageIO reader is not found for
image format.
Changed Files: source/net/yacy/document/ImageParser.java
Mon Oct 19 03:47:28 CEST 2015
by reger
remove obsolete yacy.init entry "secureHttps"
not used anywhere
Changed Files: defaults/yacy.init
Mon Oct 19 01:06:51 CEST 2015
by reger
upd to icu4j-56_1
Changed Files: .classpath, build.xml, lib/icu4j-56_1.jar, nbproject/project.xml, pom.xml
Sun Oct 18 06:19:12 CEST 2015
by reger
add a log entry on parsing ajax crawling scheme snapshot 
(prev. commit https://github.com/yacy/yacy_search_server/commit/9252e36aeb3765ba06d4dcf5543ad2e64c70bd4e) 
Changed Files: source/net/yacy/document/parser/htmlParser.java
Fri Oct 16 23:30:51 CEST 2015
by Michael Peter Christen
replaced HashMap with LinkedHashMap to preserve the object order
Changed Files: source/net/yacy/cora/util/JSONObject.java
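
Illustration for the entry above: java.util.LinkedHashMap iterates its keys in insertion order, while java.util.HashMap gives no ordering guarantee. The small class below is a hypothetical demonstration, not part of JSONObject.java.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demonstration; not taken from JSONObject.java. */
final class InsertionOrderSketch {

    /** Puts the keys into a LinkedHashMap and returns them in iteration order. */
    static List<String> keysInIterationOrder(String... keys) {
        final Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        // Iteration follows insertion order, so the list equals the input order
        // (for distinct keys).
        return new ArrayList<>(map.keySet());
    }
}
```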
Fri Oct 16 23:30:04 CEST 2015
by Michael Peter Christen
added log lines
Changed Files: source/net/yacy/document/ImageParser.java
Fri Oct 16 03:05:39 CEST 2015
by reger
init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
and allow injection of recrawl urls before queue is empty
During recrawl the balancer hangs on the very last urls often on hosts with huge delay time,
by allowing injection earlier progress is more balanced. Max number of injected crawl urls by recrawl job is 2 * max loader.
Changed Files: source/net/yacy/crawler/RecrawlBusyThread.java
Thu Oct 15 09:18:24 CEST 2015
by luc
Created a generic ViewImage performance render test.
Changed Files: test/ViewImagePerfTest.java
Wed Oct 14 10:17:09 CEST 2015
by luc
Created a ViewImage rendering performance measurement test.
Changed Files: test/ViewImageJPGPerfTest.java
Wed Oct 14 10:15:00 CEST 2015
by luc
Refactoring : split into sub-functions to make understanding and
performance measurement easier.
Changed Files: htroot/ViewImage.java
Wed Oct 14 10:13:37 CEST 2015
by luc
Updated table headers and SVG file url for case sensitive OS.
Changed Files: test/viewImageTest/ViewImageTest.html
Tue Oct 13 02:43:18 CEST 2015
by reger
unescape MultiProtocolURL getAttributes() return values.
use getAttributes() to get query parameters as clear text (w/o url encoding)
use getSearchpartMap() to get in internal format (url encoded)

fix for http://mantis.tokeek.de/view.php?id=606
Changed Files: htroot/Table_API_p.html, source/net/yacy/cora/document/id/MultiProtocolURL.java
Sun Oct 11 01:23:52 CEST 2015
by reger
refactor special handling (static override) of SUPPORTED_EXTENSIONS/MIME_TYPES
not used for genericImageParser

Changed Files: source/net/yacy/document/parser/images/genericImageParser.java
Sat Oct 10 23:49:58 CEST 2015
by reger
do not automatically add links with image extension to image links.
With the widespread use of e.g. Wikimedia, links whose url has an image file extension often point to html.
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/TransformerWriter.java
Tue Oct 06 20:48:09 CEST 2015
by luc
Added ico and bmp sample pictures
Changed Files: test/viewImageTest/ViewImageTest.html, test/viewImageTest/test/favicon.ico, test/viewImageTest/test/sails.bmp
Tue Oct 06 09:53:09 CEST 2015
by luc
Added JPEG 2000 and FITS samples
Changed Files: test/viewImageTest/test/WFPC2u5780205r_c0fx.fits, test/viewImageTest/test/relax.jp2
Tue Oct 06 09:51:47 CEST 2015
by luc
Added image formats and information for each format.
Changed Files: test/viewImageTest/ViewImageTest.html
Tue Oct 06 04:13:04 CEST 2015
by reger
handle image preview for url w empty file extension
fix of commit 688f7b2a5c194124dc2a253be45a1af1b814bef2
Changed Files: htroot/yacysearchitem.java
Mon Oct 05 01:58:31 CEST 2015
by reger
check jpeg file signature in genericImageParser
to fail early without further object allocation if source is not a jpeg.
Changed Files: source/net/yacy/document/parser/images/genericImageParser.java
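
Illustration for the entry above: a JPEG stream starts with the SOI marker FF D8, followed by the FF byte that introduces the next marker. A minimal signature check on the first bytes could look like the sketch below. The class and method names are hypothetical, and how genericImageParser obtains the bytes is not shown in the commit message.

```java
/** Hypothetical helper for illustration; not taken from genericImageParser.java. */
final class JpegSignatureSketch {

    /**
     * Returns true if the buffer starts with FF D8 FF, i.e. the JPEG
     * start-of-image marker followed by the first byte of the next marker.
     */
    static boolean looksLikeJpeg(byte[] head) {
        if (head == null || head.length < 3) {
            return false; // too short to carry the signature
        }
        return (head[0] & 0xFF) == 0xFF
            && (head[1] & 0xFF) == 0xD8
            && (head[2] & 0xFF) == 0xFF;
    }
}
```

Checking three bytes up front lets a caller reject non-JPEG input before any decoder or metadata reader objects are created.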
Sun Oct 04 05:43:40 CEST 2015
by reger
use recrawljob w/o sort results by date
This is a workaround for existing index (not fully reindexed) since intro of schema with docvalues
to prevent solr exception causing recrawljob to fail with
org.apache.solr.core.SolrCore java.lang.IllegalStateException: unexpected docvalues type NONE for field 'load_date_dt' (expected=NUMERIC). Use UninvertingReader or index with docvalues.
Changed Files: source/net/yacy/crawler/RecrawlBusyThread.java
Sun Oct 04 05:43:16 CEST 2015
by Michael Peter Christen
removed unused import
Changed Files: htroot/yacysearchtrailer.java
Sat Oct 03 21:43:41 CEST 2015
by reger
upd to poi-3.13
Changed Files: .classpath, build.xml, lib/poi-3.13-20150929.jar, lib/poi-3.13.License, lib/poi-scratchpad-3.13-20150929.jar, nbproject/project.xml, pom.xml
Fri Oct 02 12:41:30 CEST 2015
by luc
Created a html test page to check ViewImage rendering with different
file formats.
Changed Files: test/viewImageTest/ViewImageTest.html, test/viewImageTest/test/JPEG_example_JPG_RIP_100.jpg, test/viewImageTest/test/PNG_transparency_demonstration_1.png, test/viewImageTest/test/Rotating_earth_(large).gif, test/viewImageTest/test/SVG_Logo.svg, test/viewImageTest/test/Sunflower_as_gif_websafe.gif, test/viewImageTest/test/marbles.tif, test/viewImageTest/test/sample.cgm
Fri Oct 02 01:48:48 CEST 2015
by reger
allow/display svg images in image results previews
svg is not supported by awt but by most browsers. Image content is delivered as received (without size adjustment)
Changed Files: htroot/ViewImage.java, htroot/yacysearchitem.java, source/net/yacy/cora/document/analysis/Classification.java
Thu Oct 01 23:11:58 CEST 2015
by reger
remove some unused var allocation in parser
Changed Files: source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/parser/images/metadataImageParser.java, source/net/yacy/document/parser/rssParser.java, source/net/yacy/document/parser/sitemapParser.java, source/net/yacy/document/parser/swfParser.java
Thu Oct 01 13:21:28 CEST 2015
by Michael Peter Christen
follow-up to latest commit: also flush the search cache if all crawls
had been terminated.
Changed Files: source/net/yacy/search/Switchboard.java
Thu Oct 01 13:18:44 CEST 2015
by Michael Peter Christen
every time a crawl is started, the user expects a different search
result behaviour. This requires that the search cache is flushed for
each crawl start. TODO: this should also be done if a crawl is
terminated.
Changed Files: htroot/Crawler_p.java
Thu Oct 01 13:09:33 CEST 2015
by Michael Peter Christen
in case that the include_string contains several entries including
1-char tokens and also more-than-1-char tokens, then remove the 1-char
tokens to prevent that we are too strict. This will make it possible to
be a bit more fuzzy in the search where it is appropriate.
Changed Files: source/net/yacy/search/query/QueryGoal.java
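
Illustration for the entry above: a sketch of the described rule, dropping 1-char tokens only when longer tokens are present as well. The class and method names are hypothetical; QueryGoal.java may implement the rule differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not taken from QueryGoal.java. */
final class TokenFilterSketch {

    /**
     * If the list mixes short tokens (at most one character) with longer
     * ones, only the longer tokens are kept. Otherwise the list is returned
     * unchanged (as a copy).
     */
    static List<String> dropSingleCharTokens(List<String> tokens) {
        boolean hasLong = false;
        boolean hasShort = false;
        for (String token : tokens) {
            if (token.length() > 1) hasLong = true; else hasShort = true;
        }
        if (!(hasLong && hasShort)) {
            return new ArrayList<>(tokens); // nothing to relax
        }
        final List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() > 1) kept.add(token);
        }
        return kept;
    }
}
```

A query consisting only of 1-char tokens is left alone, so such a search is still possible.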
Thu Oct 01 13:03:22 CEST 2015
by Michael Peter Christen
add also 1-character tokens to the token list because that could be also
searched for. A full-string search for a filename may fail if those
1-char tokens are omitted
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Tue Sep 29 22:57:33 CEST 2015
by reger
add an end condition to svgParser for wrong content
(if parser chosen just by file extension)
Changed Files: source/net/yacy/document/parser/images/svgParser.java
Sun Sep 27 03:24:28 CEST 2015
by reger
remove double caching of inputstream in ViewImage
Changed Files: htroot/ViewImage.java
Sun Sep 27 00:17:42 CEST 2015
by reger
fix old/obsolete solr dependency to stax
delete obsolete jar
Changed Files: .classpath, build.xml, nbproject/project.xml, pom.xml
Sat Sep 26 19:58:15 CEST 2015
by reger
Add report profile with OWASP Dependency-Check to maven pom
Changed Files: pom.xml
Sat Sep 26 17:30:34 CEST 2015
by reger
remove rdfParser from init (current function identical with genericParser)
Changed Files: source/net/yacy/document/TextParser.java
Sat Sep 26 17:27:33 CEST 2015
by reger
add svgParser to parse metadata from svg images
Reads document level included title and description and skips the graphic content to save bandwidth.
svg metadata element is not interpreted
- remove rdfParser from init (current function identical with genericParser)
Changed Files: source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/images/svgParser.java
Sat Sep 26 15:42:23 CEST 2015
by reger
optimize parseInt for <img> tag attribute parsing
Performs better than using NumberFormat.parse or parseInt(substring())
Changed Files: source/net/yacy/cora/util/NumberTools.java, source/net/yacy/document/parser/html/ContentScraper.java, test/net/yacy/cora/util/NumberToolsTest.java
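
Illustration for the entry above: one way to read an integer from an attribute value such as "100px" without creating a substring or a NumberFormat instance is to walk the characters directly. The sketch below is an assumption about the general technique. The class name, method name and fallback parameter are hypothetical and do not describe the actual NumberTools API.

```java
/** Hypothetical helper for illustration; not the NumberTools implementation. */
final class LeadingIntSketch {

    /**
     * Parses an optional sign and the decimal digits that follow, starting
     * at index 'start' (which must not be negative). Parsing stops at the
     * first non-digit. Returns 'fallback' if no digit is found or the value
     * does not fit into an int.
     */
    static int parseLeadingInt(String s, int start, int fallback) {
        final int n = s.length();
        int i = start;
        while (i < n && s.charAt(i) == ' ') i++; // skip leading blanks
        boolean negative = false;
        if (i < n && (s.charAt(i) == '-' || s.charAt(i) == '+')) {
            negative = s.charAt(i) == '-';
            i++;
        }
        long value = 0;
        int digits = 0;
        while (i < n) {
            final char c = s.charAt(i);
            if (c < '0' || c > '9') break; // e.g. the "px" suffix
            value = value * 10 + (c - '0');
            if (value > Integer.MAX_VALUE + 1L) return fallback; // overflow guard
            digits++;
            i++;
        }
        if (digits == 0) return fallback;
        if (negative) value = -value;
        if (value > Integer.MAX_VALUE) return fallback;
        return (int) value;
    }
}
```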
Thu Sep 24 01:58:19 CEST 2015
by reger
check for loading error (includes unsupported formats)
to prevent blank thumbnail display in image search because of unhandled sources which don't load on click.
Now the cross icon indicates the problem (including unsupported format)
Changed Files: source/net/yacy/document/ImageParser.java
Wed Sep 23 21:01:51 CEST 2015
by luc
Correction for mantis 535: inurl: parameter doesn't work on URLs with
upper-case letters
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Wed Sep 23 00:13:10 CEST 2015
by reger
harmonize/correct assignment to Ymarkmeta.mime
replace use of deprecated
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java, source/net/yacy/data/ymark/YMarkMetadata.java
Tue Sep 22 11:56:17 CEST 2015
by Michael Peter Christen
Fix for index entries which have id's not computed as hash from the url.
This makes it possible to operate with outside-computed url hashes in
enterprise environments not using the built-in crawler from YaCy.
Changed Files: source/net/yacy/cora/document/id/DigestURL.java, source/net/yacy/peers/Protocol.java
Tue Sep 22 03:52:15 CEST 2015
by reger
remove unused check for known fileextension in searchtrailer
(check is done on add to filetype-nav)
Changed Files: htroot/yacysearchtrailer.java
Tue Sep 22 00:12:31 CEST 2015
by reger
optionally include mime in p2p url exchange string
if doctype decodes to ambiguous mime and default conversion is not equal to original
 
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Sun Sep 20 23:28:42 CEST 2015
by reger
add Portuguese month names to date recognition
Changed Files: source/net/yacy/document/DateDetection.java
Sat Sep 19 05:30:55 CEST 2015
by reger
fix html parser taking <style> content as text.
Noticed some result description contain css content from style tag.
Added <style> to tag list to scrape its content not as text
+ test case included
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, test/net/yacy/document/parser/htmlParserTest.java
Fri Sep 18 02:25:44 CEST 2015
by Michael Peter Christen
patch for a bug inside of solr since solr 5.0 when using a boost
function with a numeric date field:
"unexpected docvalues type NUMERIC for field 'last_modified' (expected
one of [SORTED, SORTED_SET]). Use UninvertingReader or index with
docvalues."
This is a well-known bug inside solr which currently prevents the 'sort
by date' in the YaCy search interface from being used. Without this patch no
results at all are displayed (since the exception prevents that). Now
there is at least a result but it is not ordered properly.
Changed Files: source/net/yacy/cora/federate/solr/Ranking.java
Tue Sep 15 02:11:30 CEST 2015
by reger
limit css scrolling to result/content window x
from pull request #10
Changed Files: htroot/env/bootstrap-base.css
Tue Sep 15 02:09:17 CEST 2015
by Burkhard
Merge pull request #10 from Raegdan/raegdan-css-layout-fix

Fixed CSS scrolling
Changed Files: htroot/env/bootstrap-base.css
Sun Sep 13 20:23:15 CEST 2015
by reger
Hack to prevent Solr issue on partial update on a document containing multivalued date field
(regardless of whether these fields are part of the update).
Switch partial update option off in postprocessing if schema contains *_dts (multivalued date field).
see http://mantis.tokeek.de/view.php?id=601
Changed Files: source/net/yacy/search/Switchboard.java
Sun Sep 13 20:19:50 CEST 2015
by reger
adapt SolrServerConnector.add to handle error on partial update input document.
In case of error we deleted the original document and added the new doc to the index.
This is not valid for partial update documents (which contain only a subset of the fields).
Remove the "delete" error handling step.
Changed Files: source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java
Sun Sep 13 06:02:07 CEST 2015
by reger
add test case for partial update - to discover effect on YaCy for update of documents with multivalued date fields (like dates_in_content_dts) 
current result: loss of fields/information in index document, see EmbeddedSolrConnectorTest.testUdate_withMultivaluedDateField()
Changed Files: test/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnectorTest.java
Sat Sep 12 23:06:13 CEST 2015
by reger
on reindex delete index document with invalid url
if discovered
Changed Files: source/net/yacy/crawler/RecrawlBusyThread.java
Sat Sep 12 22:00:40 CEST 2015
by reger
use a parsed date in Document.toString
Changed Files: source/net/yacy/document/Document.java
Fri Sep 11 17:23:59 CEST 2015
by luccioman
Returned again to main repository location : does anyone want to
consider mantis 597 ?  (http://mantis.tokeek.de/view.php?id=597)
Changed Files: htroot/env/style.java
Mon Sep 07 02:36:22 CEST 2015
by luccioman
Merge from main repository
Changed Files: htroot/style.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Mon Sep 07 02:36:22 CEST 2015
by reger
improve filtering by filetype navigator.
The used url-filter for filetype doesn't require ".ext" resulting in too many matches,
add a sort-out filter for RWI results.
Changed Files: source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Sun Sep 06 22:19:05 CEST 2015
by reger
prevent metadata records in index w/o valid url
by throwing MalformedURL exception on URIMetadataNode creation
Changed Files: source/net/yacy/cora/federate/AbstractFederateSearchConnector.java, source/net/yacy/cora/federate/SolrFederateSearchConnector.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java
Sun Sep 06 04:28:27 CEST 2015
by reger
extract modification date from vCard (vcfParser)
Changed Files: source/net/yacy/document/parser/vcfParser.java
Sun Sep 06 00:04:54 CEST 2015
by reger
extract lastmodified from openoffice doc
set lastmod date in office document parsers
Changed Files: source/net/yacy/document/parser/docParser.java, source/net/yacy/document/parser/odtParser.java, source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/document/parser/pptParser.java, source/net/yacy/document/parser/xml/ODMetaHandler.java
Sat Sep 05 14:07:23 CEST 2015
by Michael Peter Christen
in case that a site crawl is started for urls with file:// path, the
host filter does not work because there is no host given in such urls.
In that case, patch the filter to be a sub-path filter.
Changed Files: htroot/Crawler_p.java
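
Illustration for the entry above: a sketch of the fallback from a host filter to a sub-path filter when the start url has no host. The class name, method name and the exact regular expressions are assumptions made for this example. The filter syntax actually built in Crawler_p.java is not given in the commit message.

```java
import java.net.URI;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not taken from Crawler_p.java. */
final class SiteFilterSketch {

    /**
     * Builds a must-match regex for a site crawl. With a host, the filter is
     * host based. Without one (e.g. file:///...), it is restricted to the
     * directory of the start url instead.
     */
    static String mustMatchFilter(String startUrl) {
        final String host = URI.create(startUrl).getHost();
        if (host != null && !host.isEmpty()) {
            return ".*" + Pattern.quote(host) + ".*"; // assumed host filter form
        }
        // No host available: fall back to everything below the start directory.
        final int lastSlash = startUrl.lastIndexOf('/');
        final String directory = startUrl.substring(0, lastSlash + 1);
        return Pattern.quote(directory) + ".*";
    }
}
```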
Sat Sep 05 01:57:30 CEST 2015
by reger
fix exception throw after sendError in DefaultServlet
- reduce debug exception logs in crawler
Changed Files: source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Fri Sep 04 17:05:06 CEST 2015
by Michael Peter Christen
Merge pull request #12 from luccioman/master

Updated french locale and added new translator utils
Changed Files: htroot/Steering.html, htroot/yacysearchitem.html, locales/fr.lng, source/net/yacy/data/Translator.java, source/net/yacy/kelondro/util/FileUtils.java, source/net/yacy/utils/translation/ListNonTranslatedFiles.java, source/net/yacy/utils/translation/SourceFileFilter.java, source/net/yacy/utils/translation/TranslateAll.java, source/net/yacy/utils/translation/TranslatorUtil.java
Fri Sep 04 13:52:03 CEST 2015
by luccioman
Return to main repository version
Changed Files: htroot/env/style.java
Fri Sep 04 13:44:44 CEST 2015
by luccioman
Added utils to help translation without launching full YaCy application:
- translate all source files with a locale
- list all non translated files with a locale
Changed Files: source/net/yacy/utils/translation/ListNonTranslatedFiles.java, source/net/yacy/utils/translation/TranslateAll.java, source/net/yacy/utils/translation/TranslatorUtil.java
Fri Sep 04 13:42:57 CEST 2015
by luccioman
Added a function to list files recursively.
Changed Files: source/net/yacy/kelondro/util/FileUtils.java
Fri Sep 04 13:42:10 CEST 2015
by luccioman
Translator refactoring : 
- deleted useless new StringBuilder allocation
- use of a new reusable FileNameFilter
- added javadoc
Changed Files: source/net/yacy/data/Translator.java, source/net/yacy/utils/translation/SourceFileFilter.java
Thu Sep 03 23:36:57 CEST 2015
by reger
fix missing license in image search
see http://mantis.tokeek.de/view.php?id=522
Changed Files: htroot/yacysearchitem.java
Thu Sep 03 09:02:03 CEST 2015
by luccioman
Updated french translations for yacysearchitem.html,
yacysearchtrailer.html and Steering.html files.
Corrected various labels.
Changed Files: locales/fr.lng
Thu Sep 03 08:59:17 CEST 2015
by luccioman
Corrected br markup
Changed Files: htroot/Steering.html
Thu Sep 03 08:58:14 CEST 2015
by luccioman
Corrected bookmark link title
Changed Files: htroot/yacysearchitem.html
Thu Sep 03 00:59:14 CEST 2015
by reger
fix missing onclick in ConfigPortal
to enable checkbox
Changed Files: htroot/ConfigPortal.html
Wed Sep 02 19:10:39 CEST 2015
by sixcooler
ignore /DATA (Eclipse Mars)
Changed Files: .gitignore
Tue Sep 01 23:22:48 CEST 2015
by reger
apply same size constraint on result image from doc
as for linked images
see https://github.com/yacy/yacy_search_server/commit/19f1308bf09172d2be66c58289d52ba2b2c0cf9d
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Sep 01 21:47:25 CEST 2015
by reger
enable Solr schema dynamicField _p (type=location) for YaCy coordinate_p field
Changed Files: defaults/solr/schema.xml
Mon Aug 31 23:28:03 CEST 2015
by reger
complete TODO: getFileExtension handle dot in query part
+ testcase
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
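
Illustration for the entry above: a dot inside the query part must not be mistaken for the start of a file extension, so the query (and fragment) is cut off before the last dot of the file name is searched. The sketch below works on a plain url string. The class and method names are hypothetical and do not reproduce MultiProtocolURL.getFileExtension.

```java
/** Hypothetical helper for illustration; not taken from MultiProtocolURL.java. */
final class FileExtensionSketch {

    /** Returns the extension of the file name in the url path, or "" if there is none. */
    static String fileExtension(String url) {
        // Cut off query and fragment first so that dots in them are ignored.
        int end = url.length();
        final int query = url.indexOf('?');
        if (query >= 0) end = query;
        final int fragment = url.indexOf('#');
        if (fragment >= 0 && fragment < end) end = fragment;
        final String path = url.substring(0, end);

        // A url with a scheme but without a path has no file name at all.
        final int scheme = path.indexOf("://");
        if (scheme >= 0 && path.indexOf('/', scheme + 3) < 0) {
            return "";
        }
        final String name = path.substring(path.lastIndexOf('/') + 1);
        final int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1);
    }
}
```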
Mon Aug 31 19:57:57 CEST 2015
by sixcooler
French Translation update by Luc:
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5671
Changed Files: locales/fr.lng
Mon Aug 31 01:58:36 CEST 2015
by reger
start recording/indexing pixel size for image document
as for linked images
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sun Aug 30 23:02:19 CEST 2015
by reger
check mime prior to ext for metadata modification for images
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sun Aug 30 02:19:52 CEST 2015
by reger
enforce the result images limit to > 16x16px
for linked images
http://mantis.tokeek.de/view.php?id=594
Changed Files: source/net/yacy/search/query/SearchEvent.java
Thu Aug 27 09:25:11 CEST 2015
by luccioman
Updated french translation for index.html, yacysearch.html and
simpleheader.template. Corrected special characters to use HTML entities
instead.
Changed Files: defaults/yacy.network.freeworld.unit, locales/fr.lng
Wed Aug 26 23:58:08 CEST 2015
by reger
remove expired domain titan.deep-one.in from bootstrap.seedlist
Changed Files: defaults/yacy.network.freeworld.unit
Wed Aug 26 13:57:00 CEST 2015
by luccioman
Updated translation of index.html, yacysearch.html and
simpleheader.template, corrected some special characters not written as
HTML entities.
Changed Files: htroot/style.java, locales/fr.lng
Tue Aug 25 23:26:17 CEST 2015
by reger
fix NPE on .yacyh result url of disconnected peer
(cleanup yacyshare remaining)
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Tue Aug 25 02:19:00 CEST 2015
by reger
log missing seed.port
in favour of exception to prevent repeating throws
Changed Files: source/net/yacy/peers/Seed.java
Tue Aug 25 01:16:41 CEST 2015
by reger
fix: Preserve protocol in url proxy 
to connect to http/https. Display warning if https target is viewed over http
Changed Files: source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/http/servlets/YaCyProxyServlet.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Wed Aug 19 22:46:48 CEST 2015
by reger
upd to jsoup-1.8.3
Changed Files: .classpath, build.xml, lib/jsoup-1.8.3.jar, nbproject/project.xml, pom.xml
Mon Aug 10 20:53:20 CEST 2015
by sixcooler
added / corrected charset to be 1.7 compatible.
@Orbiter: please check if this is ok for you
Changed Files: source/net/yacy/document/ProbabilisticClassifier.java
Sun Aug 09 21:01:30 CEST 2015
by reger
exclude more default search fields from  text copy to text_t
for metadata index documents
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sat Aug 08 18:35:49 CEST 2015
by reger
remove obsolete interface SearchAccumulator
and unused SRURSSConnector Thread inheritance
Changed Files: source/net/yacy/cora/federate/opensearch/SRURSSConnector.java
Mon Aug 03 05:17:22 CEST 2015
by Michael Peter Christen
enhanced logging
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/util/ConcurrentLog.java
Mon Aug 03 00:49:24 CEST 2015
by reger
! finish running crawls before applying !
Allow crawl urls up to 2048 characters
fix for http://mantis.tokeek.de/view.php?id=575
Changed Files: source/net/yacy/crawler/retrieval/Request.java
Sun Aug 02 22:56:14 CEST 2015
by reger
use some more declared HeaderFramework constants
Changed Files: source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/protocol/http/GzipCompressingEntity.java, source/net/yacy/cora/protocol/http/GzipRequestInterceptor.java, source/net/yacy/cora/protocol/http/GzipResponseInterceptor.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Sun Aug 02 21:36:44 CEST 2015
by reger
add missing ; in base.css
Changed Files: htroot/yacy/ui/css/base.css
Sun Aug 02 03:39:58 CEST 2015
by reger
pom: have Maven dependency management decide on transitive Lucene dependencies
Changed Files: pom.xml
Sun Aug 02 00:53:49 CEST 2015
by reger
replace deprecated myPublicLocalIP() in AbstractRemoteHandler
Changed Files: source/net/yacy/http/AbstractRemoteHandler.java
Sun Aug 02 00:20:14 CEST 2015
by reger
remove unused Transmission hit counter
Changed Files: source/net/yacy/peers/Transmission.java
Sat Aug 01 23:54:26 CEST 2015
by reger
use more absolute path for config file opening
as suggested in pull request 5 (https://github.com/yacy/yacy_search_server/pull/5)
Changed Files: source/net/yacy/search/Switchboard.java
Thu Jul 30 14:10:31 CEST 2015
by Michael Peter Christen
added bayes filter from Philipp Nolte, originally taken from
https://github.com/ptnplanet/Java-Naive-Bayes-Classifier
and modified inside the loklak.org project. After optimization in loklak
it was inserted into the net.yacy.cora.bayes package. It shall be used
to create custom search navigation filters.

The original copyright notice was copied from the README.md from
https://github.com/ptnplanet/Java-Naive-Bayes-Classifier/blob/master/README.md
The original package domain was
de.daslaboratorium.machinelearning.classifier
Changed Files: source/net/yacy/cora/bayes/BayesClassifier.java, source/net/yacy/cora/bayes/Classification.java, source/net/yacy/cora/bayes/Classifier.java
Thu Jul 30 13:39:10 CEST 2015
by Michael Peter Christen
using latest enhanced (un/)gzip methods from loklak for yacy
Changed Files: source/net/yacy/utils/gzip.java
Thu Jul 30 03:21:40 CEST 2015
by Michael Peter Christen
added export option to export the fulltext of the search index text only
Changed Files: htroot/IndexExport_p.html, htroot/IndexExport_p.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/search/index/Fulltext.java
Mon Jul 27 15:16:08 CEST 2015
by Michael Peter Christen
try a healing of the cache if the index file is corrupted
Changed Files: source/net/yacy/crawler/data/Cache.java
Mon Jul 27 15:03:13 CEST 2015
by Michael Peter Christen
added log lines for query performance profiling
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Mon Jul 27 00:57:19 CEST 2015
by reger
upd to SLF4J-1.7.12
Changed Files: .classpath, build.xml, lib/jcl-over-slf4j-1.7.12.jar, lib/log4j-over-slf4j-1.7.12.jar, lib/slf4j-api-1.7.12.jar, lib/slf4j-jdk14-1.7.12.jar, nbproject/project.xml
Sun Jul 26 00:53:40 CEST 2015
by reger
upd to httpclient-4.5 and httpmime-4.5
Changed Files: .classpath, build.xml, lib/httpclient-4.5.License, lib/httpclient-4.5.jar, lib/httpmime-4.5.License, lib/httpmime-4.5.jar, nbproject/project.xml, pom.xml
Sat Jul 25 00:50:41 CEST 2015
by reger
upd to icu4j-55.1.jar
Changed Files: .classpath, build.xml, lib/icu4j-55_1.jar, nbproject/project.xml, pom.xml
Tue Jul 21 22:31:34 CEST 2015
by reger
upd to jsch-0.1.53.jar
Changed Files: .classpath, build.xml, lib/jsch-0.1.53.License, lib/jsch-0.1.53.jar, nbproject/project.xml, pom.xml
Tue Jul 21 07:21:10 CEST 2015
by Kirill Fomchenko
Fixed CSS scrolling

When the sidebar on the search page becomes scrollable, the scrollbar shrinks the sidebar and makes the search results weirdly scrollable on the X axis by several pixels. Now the sidebar always has a scrollbar, and results are never X-scrollable.
Changed Files: htroot/env/bootstrap-base.css
Mon Jul 20 03:45:23 CEST 2015
by reger
upd to lib/weupnp-0.1.3.jar
Changed Files: .classpath, build.xml, lib/weupnp-0.1.3.jar, nbproject/project.xml, pom.xml
Thu Jul 16 23:42:41 CEST 2015
by Michael Peter Christen
added jsonp to suggest servlet
Changed Files: htroot/suggest.java, htroot/suggest.json
Wed Jul 15 01:04:59 CEST 2015
by reger
upd NB classpath
Changed Files: nbproject/project.xml
Fri Jul 10 16:47:19 CEST 2015
by Michael Peter Christen
remove old vocabularies and synonyms before adding new
Changed Files: .classpath, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java
Thu Jul 09 16:25:11 CEST 2015
by Michael Peter Christen
added another default network / commented out
Changed Files: defaults/yacy.init
Wed Jul 08 17:36:37 CEST 2015
by Michael Peter Christen
added msg (text emails) format; should be handled by html parser.
Changed Files: source/net/yacy/document/parser/htmlParser.java
Wed Jul 08 03:02:10 CEST 2015
by reger
fix one implicit Integer/Long type conversion
-> causes Java 1.8 compile error
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sat Jul 04 22:49:01 CEST 2015
by reger
add CommonPattern for multiple spaces 
to eliminate empty split words on following spaces
Changed Files: htroot/ViewFile.java, htroot/Vocabulary_p.java, htroot/yacysearch_location.java, pom.xml, source/net/yacy/cora/document/feed/RSSMessage.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/cora/util/CommonPattern.java, source/net/yacy/data/DidYouMean.java, source/net/yacy/search/schema/CollectionConfiguration.java
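The effect of splitting on runs of spaces can be sketched as follows. This is an illustration only, not the code in CommonPattern.java; the class and constant names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not YaCy's CommonPattern class.
class SpaceSplitSketch {
    // Single-space pattern: consecutive spaces produce empty tokens.
    static final Pattern SPACE = Pattern.compile(" ");
    // One-or-more-spaces pattern: a run of spaces counts as one separator.
    static final Pattern SPACES = Pattern.compile(" +");

    static String[] splitSingle(String s) {
        return SPACE.split(s);
    }

    static String[] splitRuns(String s) {
        return SPACES.split(s);
    }

    public static void main(String[] args) {
        String text = "yacy  search   engine";
        // Contains empty strings between the words.
        System.out.println(Arrays.toString(splitSingle(text)));
        // Contains only the three words.
        System.out.println(Arrays.toString(splitRuns(text)));
    }
}
```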
Thu Jul 02 00:23:50 CEST 2015
by Michael Peter Christen
added enrichment of synonyms and vocabularies for imported documents
during surrogate reading: those attributes from the dump are removed
during the import process and replaced by new detected attributes
according to the setting of the YaCy peer.
This may cause that all such attributes are removed if the importing
peer has no synonyms and/or no vocabularies defined.
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/document/Tokenizer.java, source/net/yacy/document/VocabularyScraper.java, source/net/yacy/document/WordTokenizer.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java
Wed Jul 01 18:28:18 CEST 2015
by Michael Peter Christen
refactoring: separated condenser and tokenizer
Changed Files: source/net/yacy/document/Condenser.java, source/net/yacy/document/Tokenizer.java
Wed Jul 01 00:58:23 CEST 2015
by reger
Rem deprecated AdminHandlers in solrconfig.xml
avoid warning log
W  org.apache.solr.handler.admin.AdminHandlers <requestHandler name="/admin/"  class="solr.admin.AdminHandlers" /> is deprecated . It is not required anymore
Changed Files: defaults/solr/solrconfig.xml
Tue Jun 30 11:12:36 CEST 2015
by Michael Peter Christen
fix for non-authorized view of IndexBrowser: show only the number of
non-failure documents
Changed Files: htroot/HostBrowser.java
Mon Jun 29 12:28:34 CEST 2015
by Michael Peter Christen
enhanced surrogate import process speed (dramatically!)
Changed Files: source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/search/Switchboard.java
Mon Jun 29 02:02:01 CEST 2015
by Michael Peter Christen
fix for
- bad regex computation for crawl start from file (limitation on domain
did not work)
- servlet error when starting crawl from a large list of urls
Changed Files: htroot/Crawler_p.java, source/net/yacy/crawler/data/CrawlProfile.java
Wed Jun 24 13:02:12 CEST 2015
by Michael Peter Christen
suppress access to solr when doing search suggestions in case that the
index has more than two million documents. This protects the index from
being flooded with search requests that cannot be resolved before the
real search query has to be computed.
Changed Files: htroot/suggest.java, htroot/yacysearch.java, source/net/yacy/data/DidYouMean.java
Tue Jun 23 23:41:43 CEST 2015
by Michael Peter Christen
remove redundant code
Changed Files: htroot/Crawler_p.java
Sun Jun 14 22:56:26 CEST 2015
by sixcooler
Next try at a fix for the upload connection staying in blocked state.
This was caused by reading via GZIP from a close-wait connection and caused
high CPU and system loads.
Instead of implementing handling of the ReadListener, I found that a
time-limited 'get' really solves this problem.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
Wed Jun 10 02:35:37 CEST 2015
by reger
Resourceobserver log warning - deleting releases files - only on actual deletes
instead of entering routine
Changed Files: source/net/yacy/peers/operation/yacyRelease.java, source/net/yacy/search/ResourceObserver.java
Tue Jun 09 21:26:10 CEST 2015
by sixcooler
Fix for the upload connection staying in blocked state.
This was caused by reading via GZIP from a close-wait connection and caused
high CPU and system loads.
Solved by implementing handling of the ReadListener.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
Mon Jun 08 03:17:12 CEST 2015
by reger
add log entry on release file delete by ResourceObserver
Changed Files: source/net/yacy/search/ResourceObserver.java
Mon Jun 08 02:52:13 CEST 2015
by reger
implement deleteOldDownloads in ResourceObserver on low diskspace
- direct assign sb.observer (skip redundant InitThread)
Changed Files: source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/Switchboard.java
Sun Jun 07 20:37:37 CEST 2015
by Michael Peter Christen
added link to Snapshots in search results if the snapshot exists and
option is set in ConfigSearchPage_p
(this is a stub: we also need a visualization of pdf files!)
Changed Files: defaults/yacy.init, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/yacysearchitem.html, htroot/yacysearchitem.java, source/net/yacy/crawler/data/Transactions.java
Sat Jun 06 18:45:39 CEST 2015
by reger
enhance recrawl job
- allow to modify the query to select documents to  process (after job has started)
- allow to include failed urls (httpstatus <> 200)
Changed Files: htroot/IndexReIndexMonitor_p.html, htroot/IndexReIndexMonitor_p.java, source/net/yacy/crawler/RecrawlBusyThread.java
Fri Jun 05 07:22:35 CEST 2015
by Michael Peter Christen
servlet for latest commit
Changed Files: htroot/IndexExport_p.html, htroot/IndexExport_p.java
Fri Jun 05 03:36:57 CEST 2015
by reger
upd to poi-3.12.jar
Changed Files: .classpath, build.xml, lib/poi-3.12-20150511.jar, lib/poi-3.12.License, lib/poi-scratchpad-3.12-20150511.jar, nbproject/project.xml, pom.xml
Fri Jun 05 00:51:00 CEST 2015
by reger
remove augmented parsing activation from frontend
experimental implementation not used and based on error-prone experimental RDFa parser
Changed Files: htroot/env/templates/submenuSemantic.template, source/net/yacy/document/TextParser.java
Fri Jun 05 00:15:16 CEST 2015
by reger
remove RDFa parser activation from frontend
reason: experimental implementation of RDFa parser not executed (limited to special urls) but may cause an error on normal html parsing due to an inputstream.reset
Changed Files: htroot/AugmentedParsing_p.html, htroot/AugmentedParsing_p.java, source/net/yacy/document/TextParser.java
Thu Jun 04 23:03:46 CEST 2015
by Michael Peter Christen
removed the new index export method from the IndexControlURLs_p.html
servlet and moved it to a new /IndexExport_p.html servlet. This servlet
is now more prominent linked in the main menu under Production -> Index
Export/Import
Changed Files: htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, htroot/env/templates/header.template, htroot/env/templates/submenuIndexImport.template, locales/de.lng, locales/fr.lng, locales/ru.lng
Thu Jun 04 22:44:46 CEST 2015
by reger
Merge origin/master
Changed Files: skins/27c3.css, skins/28c3.css, skins/dark-blue.css, skins/dark.css, skins/phosphor.css
Thu Jun 04 22:44:01 CEST 2015
by reger
remove obsolete searchfl work table
was used to register urls with not complete words in snippet but is never accessed
Changed Files: htroot/ConfigHTCache_p.html, htroot/ConfigHTCache_p.java, htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, source/net/yacy/data/WorkTables.java, source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/Switchboard.java
Thu Jun 04 22:15:38 CEST 2015
by sixcooler
correct the dark themes to show also a dark navbar on searchresults
Changed Files: skins/27c3.css, skins/28c3.css, skins/dark-blue.css, skins/dark.css, skins/phosphor.css
Mon Jun 01 01:24:33 CEST 2015
by Michael Peter Christen
gzip compression will perform more efficiently and with a better compression
level
Changed Files: source/net/yacy/cora/protocol/http/GzipCompressingEntity.java, source/net/yacy/kelondro/blob/Compressor.java, source/net/yacy/kelondro/index/RowHandleMap.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/utils/gzip.java
Sat May 30 19:02:54 CEST 2015
by Michael Peter Christen
full solr xml exports will now be automatically compressed during
export. That makes it possible to export a solr xml dump even if disc
space is low.
Changed Files: source/net/yacy/search/index/Fulltext.java
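A minimal sketch of compressing while writing, so that the uncompressed dump never has to exist on disk. It is not the code in Fulltext.java: the class name is hypothetical, and it works on an in-memory buffer to stay self-contained, whereas a real export would wrap a file stream in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of YaCy.
class GzipExportSketch {
    // Writes the XML through a GZIPOutputStream so only compressed bytes are stored.
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the compressed bytes back, as an importer would do.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StringBuilder xml = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            xml.append("<doc>yacy</doc>");
        }
        byte[] gz = compress(xml.toString());
        System.out.println(xml.length() + " chars -> " + gz.length + " compressed bytes");
    }
}
```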
Sat May 30 17:54:02 CEST 2015
by Michael Peter Christen
wrap HeapReader close() in a catch Throwable block to prevent that an
exception during close blocks the whole shutdown process
Changed Files: source/net/yacy/kelondro/blob/HeapReader.java
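The pattern described here can be sketched as follows. This is an illustration under assumptions, not the HeapReader code; the helper name is hypothetical.

```java
// Hypothetical helper, not part of YaCy.
class SafeCloseSketch {
    // Closes the resource and reports success; any Throwable is logged and
    // swallowed so that the caller's shutdown sequence can continue.
    static boolean closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
            return true;
        } catch (Throwable t) {
            System.err.println("close failed: " + t);
            return false;
        }
    }

    public static void main(String[] args) {
        AutoCloseable broken = () -> { throw new IllegalStateException("boom"); };
        closeQuietly(broken);
        // Execution reaches this point even though close() failed.
        System.out.println("shutdown continues");
    }
}
```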
Sat May 30 13:19:59 CEST 2015
by Michael Peter Christen
added surrogate import process for exported solr dumps.
Just throw your solr dump file into DATA/SURROGATES/in/ and it will be
imported!
Changed Files: htroot/api/ymarks/import_ymark.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java
Sat May 30 06:57:15 CEST 2015
by Michael Peter Christen
prevent disc usage when showing tray animation
Changed Files: source/net/yacy/gui/Tray.java
Sat May 30 06:12:08 CEST 2015
by Michael Peter Christen
re-licensing some of my old visualization classes under LGPL 2.1
Changed Files: source/net/yacy/visualization/ChartPlotter.java, source/net/yacy/visualization/CircleTool.java, source/net/yacy/visualization/FontGenerator3Pixel.java, source/net/yacy/visualization/FontGenerator5Pixel.java, source/net/yacy/visualization/GraphPlotter.java, source/net/yacy/visualization/HexGridPlotter.java, source/net/yacy/visualization/PrintTool.java, source/net/yacy/visualization/RasterPlotter.java
Sat May 30 06:01:52 CEST 2015
by Michael Peter Christen
adding a 3-pixel font generator made some time ago..
Changed Files: source/net/yacy/visualization/FontGenerator3Pixel.java
Thu May 28 16:07:40 CEST 2015
by Michael Peter Christen
All entities of crawl profiles are now editable in the crawl profile
editor.
Changed Files: htroot/CrawlProfileEditor_p.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/search/Switchboard.java
Wed May 27 02:31:13 CEST 2015
by reger
- Image search expand box, adjust javascript hs padtominsize parameter, to make sure expand box doesn't shrink on small images
- ensure ImageResult.imagetext has a value for the link text (use filename if no alt text given)
Changed Files: htroot/js/highslide/highslide.js, source/net/yacy/search/query/SearchEvent.java
Tue May 26 23:57:06 CEST 2015
by reger
remove hard throw exception in makeResultEntry
remove not used "share." peername.yacy url rewrite
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Tue May 26 23:54:04 CEST 2015
by reger
use available mime (instead null) on imageresult from metadatanode
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue May 26 04:26:26 CEST 2015
by reger
revert deletion of BinSearch
(accident)
Changed Files: source/net/yacy/kelondro/index/BinSearch.java
Tue May 26 04:15:00 CEST 2015
by reger
Eliminate duplication of values for search ResultEntry
by instantiation from URIMetadataNode, by eliminating differentiation of ResultEntry/URIMetadataNode.
- moved remaining ResultEntry functionality to URIMetadataNode
   - for 1:1 functionality added a function makeResultEntry()
- removed ResultEntry 
- refactored related code

Main difference is after makeResultEntry the text_t content is removed and alternative title/url strings for display are calculated.


Main difference left is, that 
Changed Files: htroot/yacy/search.java, htroot/yacysearchitem.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/search/query/SearchEvent.java
Mon May 25 21:51:32 CEST 2015
by reger
fix compiler notification of missing serialID 
from last commit
Changed Files: source/net/yacy/search/snippet/ResultEntry.java
Mon May 25 21:28:48 CEST 2015
by reger
refactor ResultEntry to be based on MetadataNode/SolrDocument
to share/reuse common access routines
Changed Files: htroot/yacysearchitem.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/snippet/ResultEntry.java
Mon May 25 19:46:26 CEST 2015
by reger
Implement sharing of ioDispatcher for term & citation index
as proposed in ioDispatcher description
Changed Files: source/net/yacy/kelondro/rwi/IODispatcher.java, source/net/yacy/kelondro/rwi/IndexCell.java, source/net/yacy/search/index/Segment.java
Mon May 25 00:08:38 CEST 2015
by reger
use doctype() in ViewFile to choose display routines
in preference of getfileExtension()
Changed Files: htroot/ViewFile.java
Sun May 24 21:48:58 CEST 2015
by reger
On imageSearch prefer mime to sort out non-image documents
Generalize the hack to prevent urls with just an img extension being returned

improving http://mantis.tokeek.de/view.php?id=528
Changed Files: source/net/yacy/search/query/SearchEvent.java
Sun May 24 19:38:04 CEST 2015
by reger
improve MultiprotocolURL.getFileExtension()
prevent string OOB while querypart contains a dot (return just "")
see log snippet in http://mantis.tokeek.de/view.php?id=533
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
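One way to avoid an out-of-bounds error when the query part contains a dot is to look for the extension in the path only. The sketch below shows that idea; it is not the MultiProtocolURL implementation, and the class name, method name and input format (path plus optional query) are assumptions.

```java
import java.util.Locale;

// Hypothetical helper, not part of YaCy.
class FileExtensionSketch {
    // Returns the lower-case extension of the last path segment, or "" if
    // there is none. Dots inside the query part are ignored.
    static String fileExtension(String pathAndQuery) {
        int q = pathAndQuery.indexOf('?');
        String path = q >= 0 ? pathAndQuery.substring(0, q) : pathAndQuery;
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(fileExtension("/index.html"));
        System.out.println(fileExtension("/dir/page?file=a.b").isEmpty());
    }
}
```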
Sun May 24 18:03:27 CEST 2015
by reger
Increase IODispatcher dumpQueue size to 2 to reduce risk of concurrent emergency dump,
skip concurrent emergency merge
dealing with/see  http://mantis.tokeek.de/view.php?id=566
Changed Files: source/net/yacy/kelondro/rwi/IODispatcher.java, source/net/yacy/kelondro/rwi/IndexCell.java
Sun May 24 01:59:40 CEST 2015
by reger
fix string OoB on getImagelinks with long alttext
in description calculation
Changed Files: source/net/yacy/document/Document.java
Sat May 23 20:31:37 CEST 2015
by reger
Convert content charset for display via CacheResource_p
Cached resource charset encoding might not fit to internal handling (using utf-8),
convert resource to utf-8
see http://mantis.tokeek.de/view.php?id=576
Changed Files: htroot/CacheResource_p.java
Fri May 22 11:22:36 CEST 2015
by Michael Peter Christen
removed a -UNRESOLVED_PATTERN-
Changed Files: htroot/yacysearch.json
Sun May 17 06:21:12 CEST 2015
by reger
Limit extra sleep of BusyThread on LowMemCycle
Changed Files: source/net/yacy/kelondro/workflow/AbstractBusyThread.java
Sun May 17 00:13:00 CEST 2015
by reger
detail optimization of RecrawlThread
Changed Files: source/net/yacy/crawler/RecrawlBusyThread.java
Sat May 16 01:23:08 CEST 2015
by reger
Initial (experimental) implementation of index update/re-crawl job
added to IndexReIndexMonitor_p.html
Selects existing documents from index and feeds it to the crawler.
currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY])
Documents are  added in small chunks (200) to the crawler, only if no other crawl is running.
Changed Files: htroot/IndexReIndexMonitor_p.html, htroot/IndexReIndexMonitor_p.java, source/net/yacy/crawler/RecrawlBusyThread.java
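The chunked feeding described above could look roughly like this. It is a sketch, not the RecrawlBusyThread code: the class and method names are hypothetical, and the selection of documents via the Solr query is left out; only the query string and the chunk size of 200 come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of YaCy.
class RecrawlChunkSketch {
    // Selection query quoted in the commit message.
    static final String RECRAWL_QUERY = "fresh_date_dt:[* TO NOW-1DAY]";
    // Number of documents handed to the crawler per step.
    static final int CHUNK_SIZE = 200;

    // Splits the selected urls into chunks of at most CHUNK_SIZE entries.
    static List<List<String>> chunks(List<String> urls) {
        List<List<String>> result = new ArrayList<>();
        for (int from = 0; from < urls.size(); from += CHUNK_SIZE) {
            int to = Math.min(from + CHUNK_SIZE, urls.size());
            result.add(new ArrayList<>(urls.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 450; i++) {
            urls.add("http://example.org/" + i);
        }
        for (List<String> chunk : chunks(urls)) {
            System.out.println("feeding " + chunk.size() + " urls");
        }
    }
}
```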
Sat May 16 00:01:54 CEST 2015
by reger
correct log msg text
Changed Files: source/net/yacy/crawler/retrieval/FileLoader.java
Thu May 14 00:03:09 CEST 2015
by reger
fix extract of inboundlinks_protocol_sxt
url counter maybe > 999
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Wed May 13 21:58:43 CEST 2015
by reger
fix early return in addToCrawler
check / handle all supplied urls after error url
Changed Files: source/net/yacy/search/Switchboard.java
Tue May 12 12:06:21 CEST 2015
by Michael Peter Christen
fix for latest commit, see
https://github.com/yacy/yacy_search_server/commit/f810915717579d490259d70610dc4118b7c6e6e9#commitcomment-11145880
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Tue May 12 01:09:10 CEST 2015
by reger
fix NPE on MultiProtocolURL on url with parameter value and '='
in getAttribute
- added test case for it
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
Mon May 11 16:30:41 CEST 2015
by Michael Peter Christen
added crawl start from a clone with very, very large urls: they are now
encoded as post submit form inside a javascript creation function.
Changed Files: htroot/CrawlStartExpert.java, htroot/Table_API_p.html, htroot/Table_API_p.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/data/WorkTables.java
Mon May 11 14:42:21 CEST 2015
by Michael Peter Christen
enable api calls with very long urls
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/data/WorkTables.java, source/net/yacy/search/index/Fulltext.java
Mon May 11 01:35:12 CEST 2015
by reger
upd library reference of missing jsch-0.1.21 in seeduploadscp.xml
upd to jsch-0.1.52.jar
Changed Files: .classpath, build.xml, lib/jsch-0.1.52.License, lib/jsch-0.1.52.jar, nbproject/project.xml, pom.xml, source/net/yacy/peers/operation/yacySeedUploadScp.xml
Sun May 10 18:52:33 CEST 2015
by reger
add opensearch rss results to dht collection (due to text = snippet)
which is used to differentiate meta from full data
- make sure check for dht is not dependent on number of collection entries
Changed Files: source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/search/index/Fulltext.java
Sun May 10 15:30:21 CEST 2015
by reger
add bookmark.query to edit form
Changed Files: htroot/Bookmarks.html, htroot/Bookmarks.java
Sun May 10 15:29:23 CEST 2015
by reger
persist bookmark timestamp
on setTimeStamp()
Changed Files: source/net/yacy/data/BookmarksDB.java
Sun May 10 03:00:05 CEST 2015
by reger
upd to commons-io-2.4.jar
Changed Files: .classpath, build.xml, lib/commons-io-2.4.License, lib/commons-io-2.4.jar, nbproject/project.xml, pom.xml
Sun May 10 02:29:08 CEST 2015
by reger
update bookmark autosearch description
- add german translation
Changed Files: htroot/Bookmarks.html, locales/de.lng
Fri May 08 15:30:26 CEST 2015
by Michael Peter Christen
added option to re-index exported xml snapshot dumps to
HTCACHE/snapshots by just placing them in the SURROGATES/in path
Changed Files: source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/search/Switchboard.java
Fri May 08 14:01:30 CEST 2015
by Michael Peter Christen
revert of 8a7c68e4c7f6a682e3ef656b423ce1ad76b42caa
keeping surrogates after processing is essential for some users. If the
space they are taking is too high, please set up an automatic deletion
process (like a cronjob).
Changed Files: source/net/yacy/document/importer/OAIPMHImporter.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java
Fri May 08 13:46:27 CEST 2015
by Michael Peter Christen
added must-not-match filter to snapshot generation.
also: fixed some bugs
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java, htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/search/index/Segment.java
Fri May 08 10:38:33 CEST 2015
by Michael Peter Christen
adding a try-catch to link graph processing to prevent that a single
malformed url interrupts the storage process
Changed Files: source/net/yacy/search/index/Segment.java
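The idea of catching the error per link, so that one bad url does not abort the whole loop, can be sketched as follows. This is not the Segment.java code; the class and method names are hypothetical and java.net.URI stands in for YaCy's own url classes.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of YaCy.
class LinkLoopSketch {
    // Parses every link; a malformed link is skipped instead of
    // interrupting the processing of the remaining links.
    static List<URI> parseAll(List<String> links) {
        List<URI> parsed = new ArrayList<>();
        for (String link : links) {
            try {
                parsed.add(new URI(link));
            } catch (URISyntaxException e) {
                System.err.println("skipping malformed url: " + link);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList(
                "http://example.org/a",
                "http://example.org/a b",   // space is not allowed, gets skipped
                "http://example.org/c");
        System.out.println(parseAll(links).size() + " links stored");
    }
}
```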
Sun May 03 02:31:50 CEST 2015
by reger
on bookmarking of search result, remember orig. query in separate bookmark property
(instead of using the description field)
- adjust display and autosearch
- don't overwrite existing bookmark but combine info
Changed Files: htroot/Bookmarks.html, htroot/Bookmarks.java, htroot/api/bookmarks/posts/add_p.java, htroot/yacysearch.java, source/net/yacy/data/BookmarksDB.java, source/net/yacy/search/AutoSearch.java, source/net/yacy/search/Switchboard.java
Sat May 02 02:36:18 CEST 2015
by reger
break out of NormalizeDistributor loop on timeout
Changed Files: source/net/yacy/search/ranking/ReferenceOrder.java
Fri May 01 19:24:14 CEST 2015
by reger
harmonize filesearch input box layout
Changed Files: htroot/yacyinteractive.html
Thu Apr 30 00:01:11 CEST 2015
by reger
upd to metadata-extractor-2.8.1
Changed Files: .classpath, build.xml, lib/metadata-extractor-2.8.1.License, lib/metadata-extractor-2.8.1.jar, nbproject/project.xml, pom.xml
Wed Apr 29 01:53:04 CEST 2015
by reger
upd to poi-3.11.jar
Changed Files: .classpath, build.xml, lib/poi-3.11-20141221.jar, lib/poi-3.11.License, lib/poi-scratchpad-3.11-20141221.jar, nbproject/project.xml, pom.xml
Tue Apr 28 03:12:14 CEST 2015
by reger
fix typo in image filter query
(extra bracket)
Changed Files: source/net/yacy/search/query/QueryGoal.java
Mon Apr 27 22:38:40 CEST 2015
by reger
fix String out of range in Collection Nav
see http://mantis.tokeek.de/view.php?id=573
Changed Files: source/net/yacy/search/query/QueryParams.java
Sun Apr 26 17:42:39 CEST 2015
by reger
improve character encoding for urlproxy servlet
for non-UTF-8 pages
Changed Files: source/net/yacy/http/servlets/UrlProxyServlet.java
Sun Apr 26 17:41:05 CEST 2015
by reger
upd to jsoup-1.8.2
Changed Files: .classpath, build.xml, lib/jsoup-1.8.2.jar, nbproject/project.xml, pom.xml
Sun Apr 26 04:29:32 CEST 2015
by reger
make Quality of Service Servlet available to prioritize requests from local host
This assigns priorities to incoming requests. Higher priority numbers are served before lower.
(disabled by default in defaults/web.xml, 
uncomment or copy entry to DATA/Settings/web.xml)
Changed Files: defaults/web.xml, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/http/servlets/YaCyQoSFilter.java
Sat Apr 25 22:38:38 CEST 2015
by reger
correct typo in de.lng
Changed Files: locales/de.lng
Sat Apr 25 03:24:28 CEST 2015
by reger
upd parser calls in test cases
Changed Files: test/net/yacy/document/ParserTest.java, test/net/yacy/document/parser/htmlParserTest.java, test/net/yacy/document/parser/images/genericImageParserTest.java, test/net/yacy/document/parser/images/metadataImageParserTest.java, test/net/yacy/document/parser/pdfParserTest.java
Sat Apr 25 02:45:05 CEST 2015
by reger
add additional links to crawl queue pages
Changed Files: htroot/Crawler_p.html
Thu Apr 23 18:17:28 CEST 2015
by Michael Peter Christen
don't record dump generation calls since that
- is not a change of the index
- happens very often within self-backup strategies from the outside
(i.e. cronjobs)
Changed Files: htroot/IndexControlURLs_p.java
Tue Apr 21 10:13:57 CEST 2015
by Michael Peter Christen
Merge pull request #4 from dertuxmalwieder/master

Readme improvements
Changed Files: readme.mediawiki
Mon Apr 20 14:30:45 CEST 2015
by Sven Knurr
Readme improvements

Now GitHub should display it properly. Also, added OpenBSD.
Changed Files: readme.mediawiki
Mon Apr 20 10:24:34 CEST 2015
by Michael Peter Christen
Merge pull request #2 from Scarfmonster/master

English Synonyms and small fixes
Changed Files: addon/synonyms/mobythesaurus_en_yacy, htroot/CrawlProfileEditor_p.html, htroot/DictionaryLoader_p.html, htroot/DictionaryLoader_p.java, source/net/yacy/crawler/data/CrawlProfile.java
Mon Apr 20 10:23:39 CEST 2015
by Michael Peter Christen
Merge pull request #3 from shaman/master

How about normal font weight in the searched titles & fix RSS icon position?
Changed Files: htroot/env/base.css, htroot/yacysearch.html, locales/uk.lng, skins/27c3.css
Mon Apr 20 04:27:15 CEST 2015
by Eugene Kuligin
fix typos
Changed Files: locales/uk.lng
Mon Apr 20 04:24:50 CEST 2015
by Eugene Kuligin
add vertical margin to the search cloud block
Changed Files: htroot/env/base.css
Mon Apr 20 04:19:37 CEST 2015
by Eugene Kuligin
fix RSS icon displaying
Changed Files: htroot/yacysearch.html
Mon Apr 20 04:13:05 CEST 2015
by Eugene Kuligin
increase view space with normal font weight in the searched titles
Changed Files: skins/27c3.css
Mon Apr 20 00:01:14 CEST 2015
by reger
fix NPE in Vocabulary_p servlet
called w/o parameter
Changed Files: htroot/Vocabulary_p.java
Sun Apr 19 15:55:49 CEST 2015
by Ryszard Go?
fix for Accept '?' URLs column in Crawl Profile List
Changed Files: htroot/CrawlProfileEditor_p.html, source/net/yacy/crawler/data/CrawlProfile.java
Fri Apr 17 15:14:10 CEST 2015
by Ryszard Go?
SynonymLibrary status check fix for multiple files
Changed Files: htroot/DictionaryLoader_p.java
Fri Apr 17 16:15:35 CEST 2015
by Ryszard Go?
added English synonyms
Changed Files: addon/synonyms/mobythesaurus_en_yacy, htroot/DictionaryLoader_p.html, htroot/DictionaryLoader_p.java
Fri Apr 17 02:14:13 CEST 2015
by reger
skip redundant add. of keywords to text
search uses keywords as default search field
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Thu Apr 16 02:36:12 CEST 2015
by reger
put https port in peers dna
as we flag if a peer is accessible via https, we need to know the port if we want to use it (e.g. for interYaCy communication)
start to provide / transport the port by recording it in peers dna.
- add https link on the Network.html lock symbol
Changed Files: htroot/Network.html, htroot/Network.java, source/net/yacy/peers/Seed.java, source/net/yacy/search/Switchboard.java
Wed Apr 15 02:16:53 CEST 2015
by reger
incoming connection count/text fix
improvement on http://mantis.tokeek.de/view.php?id=570
Changed Files: htroot/AccessTracker_p.html, htroot/AccessTracker_p.java, locales/de.lng
Tue Apr 14 10:33:43 CEST 2015
by Michael Peter Christen
Merge pull request #1 from Scarfmonster/master

Search navigation fix
Changed Files: htroot/yacysearchtrailer.html
Tue Apr 14 03:19:27 CEST 2015
by reger
add info text icon next to Augmented Browsing check-box
with hint to config page
Changed Files: htroot/ConfigSearchPage_p.html
Tue Apr 14 02:07:02 CEST 2015
by reger
show "Augmented Browsing" link in search result only if urlproxy allowed and option switched on in layout
(AugmentedBrowsing_p.html, ConfigSearchPage_p.html)
as the user only gets an error page if the option is not enabled
Changed Files: htroot/yacysearchitem.java
Mon Apr 13 23:32:06 CEST 2015
by Ryszard Go?
Search navigation fix
Changed Files: htroot/yacysearchtrailer.html
Mon Apr 13 16:20:00 CEST 2015
by Michael Peter Christen
added parsing of contentprop attribute in html tags for
content='startDate' and content='endDate'. The values of these fields are
now written to new solr fields startDates_dts and endDates_dts.
Changed Files: defaults/solr.collection.schema, source/net/yacy/cora/federate/solr/SolrType.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Mon Apr 13 16:18:15 CEST 2015
by Michael Peter Christen
reverted json syntax for facet results to version from january
Changed Files: htroot/yacysearchtrailer.java, htroot/yacysearchtrailer.json
Sun Apr 12 22:02:45 CEST 2015
by Michael Peter Christen
added parsing of dd, dt and article html fields. The parsed result is
written to special solr fields which are deactivated by default.
Changed Files: defaults/solr.collection.schema, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Sat Apr 11 13:00:32 CEST 2015
by Michael Peter Christen
more logging during start-up
Changed Files: source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/NoticedURL.java
Fri Apr 10 16:16:20 CEST 2015
by Michael Peter Christen
<experimental> added parsing of <article> html element.
Whenever such an element occurs, the complete content of all article
elements replaces the parsed <content> part of documents.
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java
Fri Apr 10 15:59:18 CEST 2015
by Michael Peter Christen
enhanced suggestions
Changed Files: htroot/suggest.java, source/net/yacy/data/DidYouMean.java
Fri Apr 10 15:10:18 CEST 2015
by Michael Peter Christen
replaced "fork me" banner with github banner
Changed Files: htroot/Status.html, htroot/env/grafics/forkme_right_green_007200.png
Wed Apr 08 22:42:30 CEST 2015
by reger
upd to httpcore 4.4.1
Changed Files: .classpath, build.xml, lib/httpcore-4.4.1.jar, nbproject/project.xml, pom.xml
Tue Apr 07 16:10:13 CEST 2015
by Michael Peter Christen
split query into filter query and text query to get better ranking
results and faster results
Changed Files: source/net/yacy/http/servlets/GSAsearchServlet.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java
Tue Apr 07 13:14:41 CEST 2015
by Michael Peter Christen
testing switching off cold searchers; maybe this brings performance
enhancements when using large facets
Changed Files: defaults/solr/solrconfig.xml
Tue Apr 07 13:13:58 CEST 2015
by Michael Peter Christen
when selecting collections in navigation, do show the un-selected
collections in search result. When selecting one of them in another
search, switch off the previously selected collection. This actually
turns the collection navigation modifier into a radio-button like
behaviour
Changed Files: htroot/yacysearchtrailer.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java
Tue Apr 07 00:10:42 CEST 2015
by reger
replace deprecated getIP with getIPs in AbstractRemoteHandler
Changed Files: source/net/yacy/http/AbstractRemoteHandler.java
Sat Apr 04 21:11:09 CEST 2015
by reger
include htroot (*.class) in maven clean 
harmonize antrun javac call in pom with build.xml
Changed Files: pom.xml
Sat Apr 04 00:24:16 CEST 2015
by reger
add SynonymLibrary status to DictionaryLoader_p servlet 
http://mantis.tokeek.de/view.php?id=564
Changed Files: htroot/DictionaryLoader_p.java
Thu Apr 02 13:27:47 CEST 2015
by Michael Peter Christen
refactoring of filter queries (several queries instead only one)
Changed Files: source/net/yacy/search/query/QueryParams.java
Thu Apr 02 02:10:00 CEST 2015
by reger
show location nav as selectable nav in search page layout
- switch automatically on upon load of geodata provider
- but allow switch on also without geodata file (and display the location nav if search result has lat/lon location)
Changed Files: htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/DictionaryLoader_p.java, htroot/yacysearchtrailer.java
Wed Apr 01 18:37:45 CEST 2015
by Michael Peter Christen
use a cursor hand on facet headline to show that this is clickable
Changed Files: htroot/yacysearchtrailer.html
Wed Apr 01 18:17:52 CEST 2015
by Michael Peter Christen
added an expansion option to search facets on result page:
- if 8 or fewer facet options are present, they are shown by
default
- if more facet options are present, they are hidden
To view or hide all facets, just click on the facet header bar
Changed Files: htroot/yacysearchtrailer.html, htroot/yacysearchtrailer.java, source/net/yacy/http/servlets/GSAsearchServlet.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java
Wed Apr 01 01:57:56 CEST 2015
by reger
make location facet return results
the facet on field coordinate_p for the location nav does not return results, so coordinate_p_0_coordinate is now used as an alternative to get facet counts. As the actual facet value is not used, this should not harm any analysis (even if the facet is an incomplete location).
If the facet value is used in the future, a *_geohash field could likely be introduced (for facet and other ... as transport value)
Changed Files: source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Sun Mar 29 07:12:23 CEST 2015
by reger
fix display and limit of max server connections after startup
(on restart value returned to default=50)
This has no effect on Jetty but the limit is still respected.
Changed Files: source/net/yacy/yacy.java
Sun Mar 29 05:48:54 CEST 2015
by reger
add err msg on failure during Load_rss
Changed Files: htroot/Load_RSS_p.html, htroot/Load_RSS_p.java
Sat Mar 28 03:05:21 CET 2015
by reger
correct percent encoding for '%' char
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Tue Mar 24 12:32:39 CET 2015
by Michael Peter Christen
added a new collection type 'dht' to all documents from the peer-to-peer
interface to distinguish rich and poor document data.
This also reverts some changes from commit
796770e070daf38289b594f4cbdc65b9ce0ca2b1 because the firstSeen database
is the wrong method to distinguish these types of data
Changed Files: htroot/yacy/crawlReceipt.java, htroot/yacy/transferURL.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java
Tue Mar 24 00:13:05 CET 2015
by reger
fix missing display of CrawlerMonitor -> robots.txt Monitor
revert delete of file api/table_p.html see https://gitorious.org/yacy/rc1/commit/3ffe19b85cb6f2c1d26922eb61ca73a42096e8e9
(still used in this menu)
Changed Files: htroot/Table_RobotsTxt_p.html, htroot/api/table_p.html
Mon Mar 23 11:12:39 CET 2015
by Marc Nause
Updated Git links from Gitorious to Github.
Changed Files: addon/yacy-svn-4.spec, htroot/ConfigUpdate_p.html, htroot/Status.html, htroot/env/templates/header.template, htroot/env/templates/simpleheader.template, htroot/opensearchdescription.xml, libbuild/pom.xml, pom.xml, readme.txt
Mon Mar 23 03:57:47 CET 2015
by reger
prevent overwrite of crawled or received full documents by (newer) metadata
To protect rich index data (full resource) from overwriting by metadata gathered during remote search,
the newly introduced "firstSeen" index is used to differentiate between full-resource-doc and metadata,
as a "firstSeen" entry is only added on stores of full-resource-docs (during crawl or remote search).
Changed Files: source/net/yacy/peers/Protocol.java
Wed Mar 18 21:57:41 CET 2015
by otter
Fixes hanging FlushThread (see
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5447)
by replacing put() method by the more robust add() to
add a merge job to the queue.
Changed Files: source/net/yacy/kelondro/rwi/IODispatcher.java
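The behavioural difference between the two queue methods named in this entry can be sketched as follows. On a bounded java.util.concurrent.BlockingQueue, put() waits until space is available, while add() returns at once and throws IllegalStateException when the queue is full. The class name, method name and capacity below are hypothetical and are not taken from IODispatcher.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; this is not the IODispatcher code.
class QueueAddVsPut {

    // Tries to enqueue a job without blocking the caller.
    // Returns false when the bounded queue is already full.
    static boolean tryEnqueue(BlockingQueue<String> queue, String job) {
        try {
            // add() never waits: on a full bounded queue it throws
            // IllegalStateException, whereas put() would block here.
            return queue.add(job);
        } catch (IllegalStateException queueFull) {
            return false;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        System.out.println(tryEnqueue(queue, "merge-1"));
        System.out.println(tryEnqueue(queue, "merge-2"));
    }
}
```

A caller that must not hang can then decide what to do with a rejected job instead of waiting indefinitely.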
Mon Mar 16 02:03:40 CET 2015
by reger
fix snippet containing number with comma as decimal point  http://mantis.tokeek.de/view.php?id=344
to keep it as one word (by altering the split regex)
- added snippet test case with number
- regex for word split to match multiple split chars
Changed Files: source/net/yacy/search/snippet/TextSnippet.java, test/net/yacy/search/snippet/TextSnippetTest.java
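A minimal sketch of the idea in this entry: a word split that treats a run of split characters as one separator but keeps a comma between two digits inside the word. The pattern and class name are assumptions made for illustration; the actual regex in TextSnippet is not shown in this log.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is an illustration, not the regex used in TextSnippet.
class SnippetWordSplit {

    // One or more split characters in a row count as a single separator.
    // A comma is a separator unless it has a digit on both sides.
    private static final Pattern SPLIT =
            Pattern.compile("(?:[\\s;:!?]|,(?!\\d)|(?<!\\d),)+");

    static String[] words(String text) {
        return SPLIT.split(text);
    }

    public static void main(String[] args) {
        for (String word : words("costs 3,50 euro, ok")) {
            System.out.println(word);
        }
    }
}
```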
Sun Mar 15 22:31:47 CET 2015
by reger
fix error on *abc query input
http://mantis.tokeek.de/view.php?id=486
Changed Files: source/net/yacy/search/query/QueryModifier.java
Sun Mar 15 06:02:45 CET 2015
by reger
apply UTF-8 encoding
copied from escape()
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
Sun Mar 15 03:37:32 CET 2015
by reger
fix for path with char code > 255 
(causing index out of bound exception)
+ test case for it
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
Sun Mar 15 00:46:07 CET 2015
by reger
fix url encoding for path see http://mantis.tokeek.de/view.php?id=559
So far we used same escape procedure for all parts of the url (which includes x-www-form-urlencoded for all url components)
Added capability to use different encoding rules for the different url components (through specific bitset for each component).
(this is inspired by the org.apache.http.client and java.net.URI implementations).
- Added test case for  http://mantis.tokeek.de/view.php?id=559
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
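The per-component approach described in this entry can be sketched with one BitSet of allowed characters per URL component. The class name, field names and the exact character sets (taken here from the RFC 3986 "unreserved" and path-character rules) are assumptions; the sets actually used in MultiProtocolURL may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical sketch; names and character sets are assumptions, not MultiProtocolURL code.
class ComponentEscaper {

    static final BitSet UNRESERVED = new BitSet(128);
    static final BitSet PATH_SAFE = new BitSet(128);
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static {
        for (int c = 'a'; c <= 'z'; c++) UNRESERVED.set(c);
        for (int c = 'A'; c <= 'Z'; c++) UNRESERVED.set(c);
        for (int c = '0'; c <= '9'; c++) UNRESERVED.set(c);
        for (char c : "-._~".toCharArray()) UNRESERVED.set(c);
        // A path may additionally keep '/', ':', '@' and the sub-delimiters.
        PATH_SAFE.or(UNRESERVED);
        for (char c : "/:@!$&'()*+,;=".toCharArray()) PATH_SAFE.set(c);
    }

    // Percent-encodes every UTF-8 byte that is not in the component's safe set.
    // Working on bytes means chars above 255 are handled without index errors.
    static String escape(String text, BitSet safe) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 128 && safe.get(v)) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }
}
```

With this layout, a further component (query, fragment, user info) only needs its own BitSet; the escape routine stays the same.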
Wed Mar 11 19:36:23 CET 2015
by reger
try to fix hang on index blob merge on shutdown
http://mantis.tokeek.de/view.php?id=505
It happens but could not be reproduced. This change makes sure the terminate signal is caught at the end of currently running merge jobs
Changed Files: source/net/yacy/kelondro/rwi/IODispatcher.java
Wed Mar 11 01:05:14 CET 2015
by reger
fix url (path) %-decoding http://mantis.tokeek.de/view.php?id=519
- add test case for this
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/net/yacy/cora/document/id/MultiProtocolURLTest.java
Mon Mar 09 00:09:36 CET 2015
by reger
update configheuristics_p.html text 
to state current opensearch heuristic function
Changed Files: htroot/ConfigHeuristics_p.html, locales/de.lng, locales/ru.lng
Sun Mar 08 22:10:51 CET 2015
by reger
fix version conflict in pom
for commons-io
Changed Files: pom.xml
Sun Mar 08 21:49:23 CET 2015
by reger
exclude default search fields from  text copy to text_t
for metadata index documents (reduce text redundancy)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sun Mar 08 02:34:48 CET 2015
by reger
For remote crawlReceipt add document abstract/description
enhance the metadata returned to the originator by description_txt to improve fulltext search result hits.
Changed Files: source/net/yacy/peers/Protocol.java
Thu Mar 05 02:22:05 CET 2015
by reger
harmonize snippet computation
to always consider description_txt (solr hl & internal).
For now just added desc to text list for computation, could be further equalized with hl computation.
Changed Files: source/net/yacy/search/snippet/TextSnippet.java
Thu Mar 05 02:09:27 CET 2015
by reger
remove unused statement
Changed Files: htroot/PerformanceMemory_p.java
Mon Mar 02 13:10:05 CET 2015
by Michael Peter Christen
added special terms for the on: date modifier: tomorrow, today; e.g.
search for: "Berlin on:tomorrow" to find events happening tomorrow in
Berlin
Changed Files: source/net/yacy/document/DateDetection.java
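As a rough illustration of the special terms above, the sketch below maps "today" and "tomorrow" to concrete dates relative to a given reference day. The class and method names are hypothetical and the use of java.time is an assumption; the DateDetection code itself is not shown in this log.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical helper; not the DateDetection implementation.
class RelativeDateTerm {

    // Resolves a special on: term against a reference day.
    // Returns null when the term is not one of the special words.
    static LocalDate resolve(String term, LocalDate today) {
        switch (term.toLowerCase(Locale.ROOT)) {
            case "today":
                return today;
            case "tomorrow":
                return today.plusDays(1);
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("tomorrow", LocalDate.now()));
    }
}
```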
Mon Mar 02 12:55:31 CET 2015
by Michael Peter Christen
generalized time period computations
Changed Files: htroot/Crawler_p.java, htroot/IndexDeletion_p.java, htroot/Table_API_p.java, htroot/api/timeline_p.java, htroot/yacysearchtrailer.java, source/net/yacy/cora/date/AbstractFormatter.java
Sun Mar 01 23:50:17 CET 2015
by reger
add test case for snippet html encoding check
Changed Files: test/net/yacy/search/snippet/TextSnippetTest.java
Sat Feb 28 19:48:29 CET 2015
by reger
upd pom
Changed Files: nbproject/project.xml, pom.xml
Sat Feb 28 19:02:18 CET 2015
by reger
postpone raw snippet html encoding upon use
instead of during init of snippet 
addressing http://mantis.tokeek.de/view.php?id=551
Changed Files: source/net/yacy/search/snippet/TextSnippet.java
Fri Feb 27 00:53:20 CET 2015
by reger
apply query parameter getQueryFields() to GSA servlet
Changed Files: source/net/yacy/http/servlets/GSAsearchServlet.java
Wed Feb 25 21:11:59 CET 2015
by Marc Nause
Next try to fix start script for OpenBSD.
Changed Files: startYACY.sh
Wed Feb 25 01:58:42 CET 2015
by reger
fix mimetype of rss items in rss parser
- remove self reference as anchor for items
Changed Files: source/net/yacy/document/parser/rssParser.java
Wed Feb 25 01:05:46 CET 2015
by Michael Peter Christen
enhanced date parsing time
Changed Files: source/net/yacy/document/DateDetection.java
Mon Feb 23 23:12:07 CET 2015
by reger
introduce getQueryFields to return default query fields (query parameter QF)
calculated from boostfields config, making sure title, description, keywords and content are always searched.
- applying the change to solrServlet makes sure every remote query uses at least all locally defined boost fields for search
- apply to local solr search
- simplify select query by using QF defaults
Changed Files: source/net/yacy/cora/federate/solr/Ranking.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java
Sun Feb 22 05:42:04 CET 2015
by reger
add description_txt to default query fields,
Dublin Core Metadata field extracted by most parsers.
Changed Files: defaults/yacy.init, htroot/RankingSolr_p.java
Sun Feb 22 05:31:56 CET 2015
by reger
add extracted description/subject to pptParser
Changed Files: source/net/yacy/document/parser/pptParser.java
Fri Feb 20 02:21:04 CET 2015
by reger
url unescape add check for inconsistent utf8 multibyte parsing
If the url contains special chars (like umlauts äöü) it's interpreted as a multibyte char and actually not converted at all (removed).
Added a check: if the multibyte conversion is not complete, just add the char as is.

This fixes http://mantis.tokeek.de/view.php?id=200
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Tue Feb 17 03:16:10 CET 2015
by reger
stop sending crawl receipts if receiver went offline
Changed Files: source/net/yacy/search/Switchboard.java
Mon Feb 16 01:20:12 CET 2015
by reger
upd lucene api doc link
Changed Files: htroot/ContentAnalysis_p.html
Mon Feb 16 00:50:16 CET 2015
by reger
add extracted description/subject to docParser 
Changed Files: source/net/yacy/document/parser/docParser.java
Sun Feb 15 23:09:01 CET 2015
by reger
replace deprecated HTTPClient setStaleConnectionCheckEnabled with setValidateAfterInactivity()
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Sun Feb 15 21:34:01 CET 2015
by reger
replace deprecated HTTPClient ALLOW_ALL_HOSTNAME_VERIFIER with NoopHostnameVerifier()
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Sun Feb 15 20:39:20 CET 2015
by reger
fix formatting issue if snippet contains html code
replacement for reverted commit
https://gitorious.org/yacy/rc1/commit/61f42a792872c7fc83c7d4d64e1f0bb9d063c810
Changed Files: source/net/yacy/search/snippet/TextSnippet.java
Sat Feb 14 23:04:05 CET 2015
by reger
upd to commons-codec-1.10.jar, commons-compress-1.9.jar
Changed Files: .classpath, build.xml, lib/commons-codec-1.10.License, lib/commons-codec-1.10.jar, lib/commons-compress-1.9.License, lib/commons-compress-1.9.jar, nbproject/project.xml, pom.xml
Sat Feb 14 02:43:05 CET 2015
by reger
revert: formatting fix also eats up highlighting
need other solution for snippets with unwanted html code 
Changed Files: htroot/yacysearchitem.java
Fri Feb 13 00:50:32 CET 2015
by reger
upd to httpclient-4.4
Changed Files: .classpath, lib/httpclient-4.4.License, lib/httpclient-4.4.jar, lib/httpcore-4.4.License, lib/httpcore-4.4.jar, lib/httpmime-4.4.License, lib/httpmime-4.4.jar, nbproject/project.xml, pom.xml
Fri Feb 13 00:20:33 CET 2015
by reger
fix formatting issue in search result display 
if description contains html code
noticed e.g. for id=NmNdJ9uApLaQ  http://hswong3i.net/blog/hswong3i/virtualmin-drupal-7-x-ubuntu-12-04-howto
Changed Files: htroot/yacysearchitem.java
Wed Feb 11 23:26:39 CET 2015
by reger
allow/recognize host in file: protocol crawl target
This is useful for intranet indexing when crawling an intranet file server that is accessed via hostname, while e.g. under Windows it is mapped to different drive letters on individual clients.
Here you can crawl e.g.  file://fileserver/documents, which is a valid URI in that intranet environment (while e.g. P:/documents might be client-dependent).
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
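To illustrate what "host in a file: URL" means, the sketch below uses the JDK's java.net.URI (not YaCy's MultiProtocolURL, whose parsing may differ) to separate the host from the path. The helper class is hypothetical.

```java
import java.net.URI;

// Hypothetical helper built on java.net.URI; MultiProtocolURL is not used here.
class FileUrlHost {

    // Returns the host part of the URL, or null when the authority is empty.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    // Returns the path part of the URL.
    static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        String url = "file://fileserver/documents";
        System.out.println(hostOf(url) + " " + pathOf(url));
    }
}
```

A URL such as file:///P:/documents has an empty authority, so no host is reported and the drive letter stays part of the path.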
Wed Feb 11 01:43:02 CET 2015
by reger
fix parser test cases
(Vocabulary parameter)
Changed Files: test/net/yacy/document/ParserTest.java, test/net/yacy/document/parser/htmlParserTest.java, test/net/yacy/document/parser/images/genericImageParserTest.java, test/net/yacy/document/parser/images/metadataImageParserTest.java, test/net/yacy/document/parser/pdfParserTest.java, test/net/yacy/search/snippet/TextSnippetTest.java
Wed Feb 11 01:42:01 CET 2015
by reger
disable optimistic GC assumption in StandardMemoryStrategy
After several tests found that OOM is not prevented. The major reason in testing was the assumption that a future GC will free the average of the last 5 GCs.
Disabling this check reduced OOM exceptions.

Added simplest testcase used for verification
Changed Files: source/net/yacy/kelondro/util/StandardMemoryStrategy.java, test/net/yacy/kelondro/util/MemoryControlTest.java
Tue Feb 10 08:43:45 CET 2015
by Michael Peter Christen
the cleanup process experienced a 100% CPU load situation and the loop
did not terminate:

Occurrences: 100
at java.util.HashMap$KeyIterator.next(HashMap.java:956)
at net.yacy.cora.protocol.ConnectionInfo.cleanup(ConnectionInfo.java:300)
at net.yacy.cora.protocol.ConnectionInfo.cleanUp(ConnectionInfo.java:293)
at net.yacy.search.Switchboard.cleanupJob(Switchboard.java:2212)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at net.yacy.kelondro.workflow.InstantBusyThread.job(InstantBusyThread.java:105)
at net.yacy.kelondro.workflow.AbstractBusyThread.run(AbstractBusyThread.java:215)

This tries to fix the problem; the problem should be monitored
Changed Files: source/net/yacy/cora/protocol/ConnectionInfo.java
Mon Feb 09 18:46:06 CET 2015
by Michael Peter Christen
hack to make date detection faster (while it becomes a bit incomplete
regarding language alternatives)
Changed Files: source/net/yacy/document/DateDetection.java
Mon Feb 09 18:45:07 CET 2015
by Michael Peter Christen
enhanced suggest function
Changed Files: htroot/suggest.java, htroot/yacysearch.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/data/DidYouMean.java
Sun Feb 08 23:40:33 CET 2015
by reger
fix Umlaut handling in blekko heuristic search term
http://mantis.tokeek.de/view.php?id=169
observation: blekko seems to block xxxbot agents (=0 results)
Changed Files: defaults/heuristicopensearch.conf, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java
Sat Feb 07 22:01:54 CET 2015
by reger
url with semicolon or comma handling in proxy request
apply patch supplied with bugreport http://mantis.tokeek.de/view.php?id=540
Changed Files: source/net/yacy/http/ProxyHandler.java
Sat Feb 07 13:47:15 CET 2015
by sixcooler
small correction for last commit
Changed Files: .classpath
Sat Feb 07 00:37:43 CET 2015
by reger
upd error message for proxy
fix http://mantis.tokeek.de/view.php?id=539
Changed Files: source/net/yacy/http/AbstractRemoteHandler.java
Wed Feb 04 11:37:07 CET 2015
by Michael Peter Christen
remove remote indexing option in crawl start if not in p2p mode
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java
Wed Feb 04 03:51:34 CET 2015
by reger
adjust table column width to not line wrap crawler traffic line
Changed Files: htroot/Crawler_p.html
Wed Feb 04 01:50:35 CET 2015
by Michael Peter Christen
cloning a crawl now accepts the class name of vocabulary scrapers
Changed Files: htroot/CrawlStartExpert.java
Wed Feb 04 01:12:25 CET 2015
by Michael Peter Christen
configuration option for maxload limit for remote search
Changed Files: defaults/yacy.init, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/SwitchboardConstants.java
Tue Feb 03 03:08:34 CET 2015
by reger
add shortMemory check to heuristic search 
and skip operation on shortMemory (no request to remote opensearch systems)
Changed Files: source/net/yacy/cora/federate/FederateSearchManager.java
Sun Feb 01 05:35:09 CET 2015
by reger
fix: malformed filename in image search
fix for http://mantis.tokeek.de/view.php?id=533
Changed Files: htroot/yacysearchitem.java
Sun Feb 01 04:26:33 CET 2015
by reger
refactor: just some more usages of constant for term ":[* TO *]"
Changed Files: htroot/HostBrowser.java, htroot/RankingSolr_p.java, source/net/yacy/cora/federate/solr/logic/CatchallLiteral.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sun Feb 01 00:29:28 CET 2015
by reger
remove hardcoded initialization of language nav if not used
Changed Files: source/net/yacy/search/query/SearchEvent.java
Fri Jan 30 21:17:23 CET 2015
by Marc Nause
Added & in start script for *NIX which was lost a few commits ago.
Changed Files: startYACY.sh
Thu Jan 29 11:39:47 CET 2015
by Michael Peter Christen
refactoring of autotagging code (combined same code pieces)
Changed Files: source/net/yacy/cora/lod/vocabulary/Tagging.java
Thu Jan 29 02:45:32 CET 2015
by Michael Peter Christen
enhanced initialization speed of vocabularies by using better
normalization and by removal of unused data structures
Changed Files: source/net/yacy/cora/lod/vocabulary/Tagging.java
Thu Jan 29 02:22:28 CET 2015
by Michael Peter Christen
using precompiled CommonPattern.TAB for split
Changed Files: source/net/yacy/cora/geo/GeonamesLocation.java, source/net/yacy/cora/util/CommonPattern.java, source/net/yacy/document/parser/csvParser.java
Thu Jan 29 02:19:41 CET 2015
by Michael Peter Christen
using precompiled pattern CommonPattern.SEMICOLON for splits
Changed Files: htroot/CookieTest_p.java, htroot/Vocabulary_p.java, source/net/yacy/data/UserDB.java, source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/parser/csvParser.java, source/net/yacy/document/parser/html/ScraperInputStream.java, source/net/yacy/document/parser/vcfParser.java
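The refactoring pattern behind this entry, compiling a split expression once and reusing it, can be sketched as below. CommonPattern.SEMICOLON is the name from the commit; the stand-in class and its definition here are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a shared pattern holder such as CommonPattern.
class PrecompiledSplit {

    // Compiled once when the class is loaded and reused by every call.
    static final Pattern SEMICOLON = Pattern.compile(";");

    static String[] fields(String line) {
        return SEMICOLON.split(line);
    }

    public static void main(String[] args) {
        for (String field : fields("name;mail;phone")) {
            System.out.println(field);
        }
    }
}
```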
Thu Jan 29 02:16:42 CET 2015
by Michael Peter Christen
persistency for vocabulary facet switch
Changed Files: htroot/Vocabulary_p.html, htroot/Vocabulary_p.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/serverSwitch.java
Thu Jan 29 01:53:36 CET 2015
by Michael Peter Christen
introducing a new getConfig method which parses comma-separated lists
from setting fields; refactoring for all places where such lists are
parsed
Changed Files: htroot/IndexControlRWIs_p.java, htroot/yacysearch.java, source/net/yacy/server/http/HTTPDFileHandler.java, source/net/yacy/server/serverSwitch.java
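A minimal sketch of such a list-parsing helper, assuming that entries are trimmed and empty entries are dropped. The class name, method name and return type are hypothetical; the signature of the new getConfig method in serverSwitch is not shown in this log.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper; not the serverSwitch implementation.
class ConfigListParser {

    private static final Pattern COMMA = Pattern.compile(",");

    // Splits a comma-separated setting value into its entries,
    // trimming whitespace and skipping empty entries. Keeps input order.
    static Set<String> parseList(String value) {
        Set<String> items = new LinkedHashSet<>();
        if (value == null) {
            return items;
        }
        for (String part : COMMA.split(value)) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseList(" a, b ,,c "));
    }
}
```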
Thu Jan 29 01:35:28 CET 2015
by Michael Peter Christen
refactoring with CommonPattern.COMMA
Changed Files: defaults/yacy.init, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/data/ListManager.java, source/net/yacy/kelondro/index/Row.java
Thu Jan 29 01:22:28 CET 2015
by Michael Peter Christen
do not reindex based on vocabulary fields (there are meanwhile many of
them) and some default settings
Changed Files: defaults/yacy.init, source/net/yacy/migration.java
Thu Jan 29 00:33:07 CET 2015
by Michael Peter Christen
refactoring of reindexSolr (just replaced constant string)
Changed Files: htroot/IndexReIndexMonitor_p.java, source/net/yacy/migration.java, source/net/yacy/search/index/ReindexSolrBusyThread.java, source/net/yacy/server/serverSwitch.java
Wed Jan 28 03:59:01 CET 2015
by reger
Allow to hide linkstructure graphic in crawl monitor 
using/setting the config param DECORATION_GRAFICS_LINKSTRUCTURE
Changed Files: htroot/Crawler_p.html, htroot/Crawler_p.java, htroot/HostBrowser.html, htroot/HostBrowser.java
Tue Jan 27 17:00:20 CET 2015
by Michael Peter Christen
removed some warnings
Changed Files: htroot/ViewImage.java, htroot/api/snapshot.java, source/net/yacy/cora/federate/AbstractFederateSearchConnector.java, source/net/yacy/cora/federate/FederateSearchManager.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/search/AutoSearch.java, source/net/yacy/search/snippet/ResultEntry.java, source/net/yacy/server/http/HTTPDemon.java
Tue Jan 27 16:53:09 CET 2015
by Michael Peter Christen
the LinkedBlockingQueue is much faster than the ArrayBlockingQueue
(strange but this is the result of a test:
ArrayBlockingQueue: 39461 lines / second;
LinkedBlockingQueue: 60774 lines / second)
Changed Files: source/net/yacy/cora/language/synonyms/SynonymLibrary.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/cora/storage/Files.java, source/net/yacy/crawler/retrieval/URLRewriterLibrary.java
Sat Jan 24 23:17:07 CET 2015
by reger
fix: OOM on parsing ico file by genericImageParser

trace: java.lang.OutOfMemoryError: Java heap space
	at java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
	at java.awt.image.Raster.createPackedRaster(Raster.java:467)
	at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
	at java.awt.image.BufferedImage.<init>(BufferedImage.java:331)
	at net.yacy.document.parser.images.bmpParser$IMAGEMAP.<init>(bmpParser.java:149)
	at net.yacy.document.parser.images.bmpParser.parse(bmpParser.java:69)
	at net.yacy.document.parser.images.genericImageParser.parse(genericImageParser.java:116)
Changed Files: source/net/yacy/document/parser/images/genericImageParser.java
Sat Jan 24 07:10:14 CET 2015
by Michael Peter Christen
update to latest code changes from json.org
Changed Files: source/net/yacy/cora/util/JSONArray.java, source/net/yacy/cora/util/JSONException.java, source/net/yacy/cora/util/JSONObject.java, source/net/yacy/cora/util/JSONString.java, source/net/yacy/cora/util/JSONTokener.java
Sat Jan 24 01:53:58 CET 2015
by reger
Let auto-disabled crawls recover if low resource condition vanished.
Analogous to the auto-disabled DHT switch, auto-disabled crawls are switched back on once memory is ok,
by remembering the auto-disable in a conf parameter.
Changed Files: pom.xml, source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/SwitchboardConstants.java
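The recovery logic described in this entry can be sketched as a small state holder: a pause set by the observer is remembered in a flag, and only such a pause is lifted again when resources recover. All names are hypothetical; ResourceObserver stores the marker as a configuration parameter, which is simplified to a boolean field here.

```java
// Hypothetical sketch; not the ResourceObserver implementation.
class CrawlAutoPause {

    private boolean crawlPaused;
    private boolean pausedByObserver; // stands in for the remembered conf parameter

    // A pause requested by the user is never lifted automatically.
    void userPause() {
        crawlPaused = true;
        pausedByObserver = false;
    }

    // Called on every resource check.
    void onResourceCheck(boolean resourcesLow) {
        if (resourcesLow) {
            if (!crawlPaused) {
                crawlPaused = true;
                pausedByObserver = true; // remember who paused the crawl
            }
        } else if (pausedByObserver) {
            crawlPaused = false; // recover only from an automatic pause
            pausedByObserver = false;
        }
    }

    boolean isCrawlPaused() {
        return crawlPaused;
    }
}
```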
Fri Jan 23 17:57:54 CET 2015
by Michael Peter Christen
write java version to status page
Changed Files: htroot/Status.java, htroot/Status_p.inc
Fri Jan 23 11:31:05 CET 2015
by Michael Peter Christen
new development cycle
Changed Files: build.properties