YaCy Release current_development

Major Changes   

Commit / Description
Thu Aug 24 18:47:18 CEST 2017
by luccioman
Removed some unnecessary uses of the java.lang.reflect API.

This improves code browsing and readability, making search by references
or call hierarchy IDE features more accurate.
Changed Files: htroot/ConfigBasic.java, htroot/api/ymarks/import_ymark.java, source/net/yacy/contentcontrol/ContentControlFilterUpdateThread.java, source/net/yacy/contentcontrol/SMWListSyncThread.java, source/net/yacy/kelondro/workflow/InstantBusyThread.java, source/net/yacy/kelondro/workflow/OneTimeBusyThread.java, source/net/yacy/peers/OnePeerPingBusyThread.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/SearchEvent.java
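As an illustration of the kind of change described in this commit, the sketch below contrasts a job resolved by method name through java.lang.reflect with the same job passed as a direct method reference. The names used here (ReflectionRemovalSketch, Worker, cleanup) are hypothetical and are not taken from the YaCy sources.

```java
import java.lang.reflect.Method;

final class ReflectionRemovalSketch {

    /** Hypothetical job target; counts how many times the job ran. */
    public static final class Worker {
        int runs = 0;

        public void cleanup() {
            runs++;
        }
    }

    /** Before: the method is resolved by name at runtime, so an IDE cannot list this call site. */
    static void runByReflection(Object target, String methodName) throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(methodName);
        method.invoke(target);
    }

    /** After: the job is an ordinary Runnable, visible to "find references" and call hierarchy. */
    static void runDirectly(Runnable job) {
        job.run();
    }

    public static void main(String[] args) throws Exception {
        Worker worker = new Worker();
        runByReflection(worker, "cleanup"); // name checked only at runtime
        runDirectly(worker::cleanup);       // name checked by the compiler
        System.out.println(worker.runs);    // prints 2
    }
}
```

With the direct form, a typo in the method name becomes a compile error instead of a NoSuchMethodException at runtime.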
Mon Aug 21 09:38:20 CEST 2017
by luccioman
Improved parsing support for OOXML spreadsheets (.xlsx)

As reported by edycop in mantis 765 (
http://mantis.tokeek.de/view.php?id=765 ), parsing of xlsx files was
quite incomplete.
The parser now properly supports the "Shared String Table" entry in Office
Open XML spreadsheets, and also detects embedded URLs.

Integrating the Apache poi-ooxml library could be an option for finer
support of OOXML formats, but its SAX-style parsing example (
http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api ) tends to
show that a custom SAX handler is still efficient for lightweight
processing with a low memory footprint.
Changed Files: source/net/yacy/document/parser/ooxmlParser.java, source/net/yacy/document/parser/xml/GenericXMLContentHandler.java, source/net/yacy/document/parser/xml/OOXMLSharedStringsHandler.java, source/net/yacy/document/parser/xml/OOXMLSpreeadsheetHandler.java, test/java/net/yacy/document/ParserTest.java, test/java/net/yacy/document/parser/ooxmlParserTest.java, test/parsertest/umlaute_linux.ppsx, test/parsertest/umlaute_linux.xlsx
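A minimal sketch of the custom SAX approach mentioned in this commit, assuming the usual layout of the shared string table (an sst root holding si items whose text lives in t elements). The class name SharedStringsSketch is hypothetical and this is not the YaCy handler; it also ignores details such as phonetic runs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SharedStringsSketch extends DefaultHandler {
    private final List<String> strings = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inText = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("si".equals(qName)) {
            current.setLength(0);   // a new shared string item starts
        } else if ("t".equals(qName)) {
            inText = true;          // text content follows
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inText) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("t".equals(qName)) {
            inText = false;
        } else if ("si".equals(qName)) {
            strings.add(current.toString()); // rich text runs are concatenated
        }
    }

    /** Parses the XML of a shared string table and returns its strings in index order. */
    static List<String> parse(String xml) throws Exception {
        SharedStringsSketch handler = new SharedStringsSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.strings;
    }
}
```

A cell of type "s" in a worksheet stores an index into this list rather than the text itself, which is why skipping the table leaves most text cells unparsed.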
Mon Aug 14 14:57:58 CEST 2017
by luccioman
Implemented partial stream parsing of tar archives.

Also added JUnit tests for the tar parser and fixed unwanted use of the
tar parser as a fallback on files included in a tar archive.
Changed Files: source/net/yacy/document/parser/tarParser.java, test/java/net/yacy/document/parser/tarParserTest.java, test/parsertest/umlaute_dc_xml_iso.xml, test/parsertest/umlaute_dc_xml_utf8.xml, test/parsertest/umlaute_html_iso.html, test/parsertest/umlaute_html_utf8.html, test/parsertest/umlaute_html_xml_txt_gnu.tar, test/parsertest/umlaute_html_xml_txt_pax.tar, test/parsertest/umlaute_html_xml_txt_ustar.tar, test/parsertest/umlaute_html_xml_txt_v7.tar, test/parsertest/umlaute_linux.txt
Fri Aug 11 20:50:36 CEST 2017
by luccioman
Fixed missing transitive dependency to commons-collections4-4.1

Dependency required by poi-3.16. 

The dependency was not provided in YaCy, although previous poi versions
already declared it. This only became a problem with the upgrade from
poi-3.15 to poi-3.16 (commit dedc6552d37b5e877258abddac9621f7fe75bf9b).
Indeed, in this new poi release, a poi component used in some YaCy parser
code paths now explicitly needs a class from the commons-collections4
library : org.apache.poi.hpsf.Section now uses
org.apache.commons.collections4.bidimap.TreeBidiMap.

Impacted YaCy parsers : xlsParser, pptParser, docParser.

Issue detected by the following failing JUnit tests :
ParserTest.testpptParsers(), ParserTest.testdocParsers(),
xlsParserTest.testParse()
Changed Files: .classpath, lib/commons-collections4-4.1.License, lib/commons-collections4-4.1.jar
Sat Jul 08 09:04:03 CEST 2017
by luccioman
Started support of partial parsing on large streamed resources.

This enables the getpageinfo_p API to return a result in a reasonable
amount of time on resources in the megabyte size range and beyond.
Support is added first for the generic XML parser; for other formats, the
regular crawler limits apply as usual.
Changed Files: htroot/api/getpageinfo_p.java, source/net/yacy/crawler/retrieval/StreamResponse.java, source/net/yacy/document/AbstractParser.java, source/net/yacy/document/Document.java, source/net/yacy/document/Parser.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/GenericXMLParser.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/xml/GenericXMLContentHandler.java, source/net/yacy/kelondro/util/FileUtils.java, source/net/yacy/repository/LoaderDispatcher.java, test/java/net/yacy/document/parser/GenericXMLParserTest.java, test/java/net/yacy/document/parser/html/ContentScraperTest.java
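The idea of partial processing on a large stream can be sketched as a bounded read that stops at a byte limit and reports whether content was cut off. This is only an illustration under assumed names (PartialReadSketch, readAtMost); it does not reproduce the YaCy parser API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartialReadSketch {

    /** Result of a bounded read: the bytes kept and whether more data remained. */
    static final class Partial {
        final byte[] data;
        final boolean truncated;

        Partial(byte[] data, boolean truncated) {
            this.data = data;
            this.truncated = truncated;
        }
    }

    /** Reads at most maxBytes from the stream instead of consuming it entirely. */
    static Partial readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                // The stream ended within the limit: nothing was cut off.
                return new Partial(out.toByteArray(), false);
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        // Limit reached: probe one more byte to learn whether content remains.
        boolean truncated = in.read() >= 0;
        return new Partial(out.toByteArray(), truncated);
    }
}
```

A caller can then hand the kept bytes to a parser and flag the resulting document as partially parsed when truncated is true.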
Sat Jul 01 23:58:28 CEST 2017
by reger
upd to Jetty 9.4.6.v20170531
Adapted the login service to the changes in Jetty, partially based on pull
request #101 https://github.com/yacy/yacy_search_server/pull/101 by @automenta
Changed Files: .classpath, build.xml, htroot/ConfigUser_p.java, lib/jetty-client-9.4.6.v20170531.jar, lib/jetty-continuation-9.4.6.v20170531.jar, lib/jetty-deploy-9.4.6.v20170531.jar, lib/jetty-http-9.4.6.v20170531.jar, lib/jetty-io-9.4.6.v20170531.jar, lib/jetty-jmx-9.4.6.v20170531.jar, lib/jetty-proxy-9.4.6.v20170531.jar, lib/jetty-security-9.4.6.v20170531.jar, lib/jetty-server-9.4.6.v20170531.jar, lib/jetty-servlet-9.4.6.v20170531.jar, lib/jetty-servlets-9.4.6.v20170531.jar, lib/jetty-util-9.4.6.v20170531.jar, lib/jetty-webapp-9.4.6.v20170531.jar, lib/jetty-xml-9.4.6.v20170531.jar, lib/jetty.License, pom.xml, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/http/MonitorHandler.java, source/net/yacy/http/YaCyLegacyCredential.java, source/net/yacy/http/YaCyLoginService.java
Tue Jun 27 06:42:33 CEST 2017
by luccioman
Ensure lower case conversion consistency with any default locale.

Especially for Turkish-speaking users using "tr" as their system default
locale : strings for technical stuff (URLs, tag names, constants...)
must not be lower cased with the default locale, as 'I' doesn't become
'i' like in other locales such as "en", but becomes 'ı' (dotless i).
Changed Files: htroot/ConfigHeuristics_p.java, htroot/Crawler_p.java, htroot/api/blacklists/add_entry_p.java, htroot/api/blacklists/delete_entry_p.java, htroot/api/getpageinfo_p.java, htroot/api/ymarks/add_ymark.java, source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/data/wiki/WikiCode.java, source/net/yacy/document/Document.java, source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/document/parser/html/TransformerWriter.java, source/net/yacy/document/parser/rdfa/impl/RDFaTripleImpl.java, source/net/yacy/gui/framework/Browser.java, source/net/yacy/http/AbstractRemoteHandler.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/util/Formatter.java, source/net/yacy/kelondro/util/ISO639.java, source/net/yacy/kelondro/util/OS.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/operation/yacyRelease.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java, source/net/yacy/server/serverObjects.java, source/net/yacy/utils/translation/TranslatorXliff.java, source/net/yacy/yacy.java, test/java/net/yacy/document/parser/htmlParserTest.java
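The locale issue described in this commit can be reproduced with the standard library alone. In the sketch below, the helper name technicalLowerCase is hypothetical; Locale.ROOT is one locale-neutral choice, and the YaCy code may use a different fixed locale.

```java
import java.util.Locale;

final class LowerCaseSketch {

    /** Lower-cases a technical string (URL, tag name, constant) independently of the default locale. */
    static String technicalLowerCase(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // With Turkish rules, 'I' maps to the dotless i (U+0131), so the result is not "title".
        System.out.println("TITLE".toLowerCase(turkish).equals("title")); // prints false
        System.out.println(technicalLowerCase("TITLE"));                  // prints title
    }
}
```

The same reasoning applies to toUpperCase, where Turkish rules map 'i' to a dotted capital I.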
Mon Jun 26 16:30:21 CEST 2017
by luccioman
Added a generic XML parser, able to parse elements text and URLs.

This parser adds support for any XML based format other than already
supported XML vocabularies such as XHTML, RSS/Atom feeds... It will
eventually be used as a fallback if one of these specific parsers fails,
before falling back to the existing genericParser, which extracts little
useful information apart from URL tokens.
Changed Files: source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/GenericXMLParser.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/xml/GenericXMLContentHandler.java, source/net/yacy/kelondro/io/CharBuffer.java, test/java/net/yacy/document/parser/GenericXMLParserTest.java, test/parsertest/umlaute_dc_xml_iso.xml, test/parsertest/umlaute_dc_xml_utf8.xml
Tue Jun 20 09:21:55 CEST 2017
by luccioman
Cleaned up memory usage page HTML

- fixed validation errors
- removed deprecated attributes
- improved accessibility with richer table semantics (headers and
caption elements) and language declaration
Changed Files: htroot/PerformanceMemory_p.html, locales/cn.lng, locales/de.lng, locales/fr.lng, locales/master.lng.xlf, locales/ru.lng, locales/sk.lng, locales/uk.lng
Wed Jun 14 09:13:50 CEST 2017
by luccioman
Limit the synchronization blocking time on some Cache operations.

Using a Reentrant lock instead of the intrinsic synchronization lock
permits limiting the blocking time to acquire a lock.

Useful on a very busy Cache concurrently accessed by many threads : when
the time to acquire a lock is too high, getting/storing content on the
cache becomes inefficient, and it is then better to fall back to loading
remote resources.

Illustrated by the CacheTest stress test and some traces reported in
mantis 751 ( http://mantis.tokeek.de/view.php?id=751 )
Changed Files: source/net/yacy/crawler/data/Cache.java, source/net/yacy/kelondro/blob/ArrayStack.java, source/net/yacy/kelondro/blob/Compressor.java, source/net/yacy/search/Switchboard.java, test/java/net/yacy/crawler/data/CacheTest.java
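A minimal sketch of the technique named in this commit, using java.util.concurrent.locks.ReentrantLock.tryLock with a timeout. The class TimedLockCacheSketch, its in-memory map and the timeout value are assumptions for illustration, not the YaCy Cache implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class TimedLockCacheSketch {
    final ReentrantLock lock = new ReentrantLock();
    private final Map<String, byte[]> store = new HashMap<>();
    private final long maxWaitMillis;

    TimedLockCacheSketch(long maxWaitMillis) {
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Stores content; returns false when the lock could not be acquired in time. */
    boolean put(String key, byte[] content) throws InterruptedException {
        if (!lock.tryLock(maxWaitMillis, TimeUnit.MILLISECONDS)) {
            return false; // too busy: the caller simply skips caching
        }
        try {
            store.put(key, content);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Returns cached content, or null when absent or when the lock wait timed out. */
    byte[] get(String key) throws InterruptedException {
        if (!lock.tryLock(maxWaitMillis, TimeUnit.MILLISECONDS)) {
            return null; // too busy: the caller falls back to loading the remote resource
        }
        try {
            return store.get(key);
        } finally {
            lock.unlock();
        }
    }
}
```

A synchronized block offers no equivalent: a thread waiting on an intrinsic lock cannot give up after a deadline.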
Fri Jun 09 12:25:23 CEST 2017
by Michael Peter Christen
migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8
Also: now Version 1.921
Changed Files: .classpath, .settings/org.eclipse.jdt.core.prefs, build.properties, build.xml, defaults/solr/schema.xml, defaults/solr/solrconfig.xml, htroot/yacysearchtrailer.java, lib/commons-math3-3.4.1.jar, lib/lucene-analyzers-common-6.6.0.jar, lib/lucene-analyzers-phonetic-6.6.0.jar, lib/lucene-backward-codecs-6.6.0.jar, lib/lucene-classification-6.6.0.jar, lib/lucene-codecs-6.6.0.jar, lib/lucene-core-6.6.0.jar, lib/lucene-facet-6.6.0.jar, lib/lucene-grouping-6.6.0.jar, lib/lucene-highlighter-6.6.0.jar, lib/lucene-join-6.6.0.jar, lib/lucene-memory-6.6.0.jar, lib/lucene-misc-6.6.0.jar, lib/lucene-queries-6.6.0.jar, lib/lucene-queryparser-6.6.0.jar, lib/lucene-spatial-6.6.0.jar, lib/lucene-suggest-6.6.0.jar, lib/metrics-core-3.2.2.jar, lib/solr-core-6.6.0.jar, lib/solr-dataimporthandler-6.6.0.jar, lib/solr-solrj-6.6.0.jar, lib/spatial4j-0.6.jar, lib/zookeeper-3.4.10.jar, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/cora/federate/solr/instance/ServerMirror.java, source/net/yacy/cora/federate/solr/instance/ServerShard.java, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/FlatJSONResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/GSAResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/GrepHTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/SnapshotImagesReponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java, source/net/yacy/http/servlets/GSAsearchServlet.java, 
source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, test/java/net/yacy/document/DateDetectionTest.java
Sat Jun 03 04:00:46 CEST 2017
by luccioman
Ensure proper closing of file input streams on both success and failure

Also add, when possible, a warning level log message on input stream
closing errors instead of failing silently. This could help in
understanding some IO exceptions such as "too many files open".
Changed Files: source/net/yacy/document/parser/images/bmpParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/icoParser.java, source/net/yacy/gui/framework/Switchboard.java, source/net/yacy/kelondro/blob/Gap.java, source/net/yacy/kelondro/blob/HeapReader.java, source/net/yacy/kelondro/index/RowHandleMap.java, source/net/yacy/kelondro/index/RowHandleSet.java, source/net/yacy/kelondro/util/FileUtils.java, source/net/yacy/kelondro/util/SetTools.java, source/net/yacy/kelondro/util/XMLTables.java, source/net/yacy/repository/Blacklist.java, source/net/yacy/search/AutoSearch.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/http/TemplateEngine.java, source/net/yacy/utils/PKCS12Tool.java, source/net/yacy/utils/cryptbig.java, source/net/yacy/utils/tarTools.java, source/net/yacy/utils/translation/TranslationManager.java, test/java/net/yacy/document/parser/htmlParserTest.java, test/java/net/yacy/document/parser/images/genericImageParserTest.java, test/java/net/yacy/document/parser/images/metadataImageParserTest.java, test/java/net/yacy/document/parser/pdfParserTest.java
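The pattern described in this commit can be sketched as follows: the stream is closed in a finally block whether or not reading succeeded, and a failure of close() is logged at warning level rather than ignored. java.util.logging stands in here for whatever logger the project uses, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

final class StreamClosingSketch {
    private static final Logger LOG = Logger.getLogger(StreamClosingSketch.class.getName());

    /** Reads the first byte and always closes the stream; a close failure is logged, not thrown. */
    static int firstByte(InputStream in) throws IOException {
        try {
            return in.read();
        } finally {
            try {
                in.close();
            } catch (IOException e) {
                // Keep a trace of the problem instead of failing silently.
                LOG.log(Level.WARNING, "Could not close input stream", e);
            }
        }
    }
}
```

When a close failure should propagate instead, a try-with-resources statement gives the same closing guarantee with less code.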
Fri Jun 02 12:14:29 CEST 2017
by luccioman
Ensure proper closing of file input streams.
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/cora/geo/OpenGeoDBLocation.java, source/net/yacy/cora/protocol/ftp/FTPClient.java, source/net/yacy/cora/storage/Files.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/data/Translator.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/Document.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/utils/CryptoLib.java, source/net/yacy/utils/PKCS12Tool.java, source/net/yacy/utils/cryptbig.java, source/net/yacy/utils/gzip.java, source/net/yacy/yacy.java, test/java/net/yacy/document/ParserTest.java, test/java/net/yacy/document/parser/xlsParserTest.java
Fri Jun 02 01:00:21 CEST 2017
by reger
Introduce keyword query parameter 
This enables the keyword navigator to filter on keywords. Added search page
output and layout config for keywords, allowing e.g. intranet setups
to display the keywords. No styling or links are applied to the keyword
text yet (this would be desirable, possibly in combination with
bootstrap-tagsinput, for future/intranet use).
Changed Files: defaults/yacy.init, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/index.html, htroot/yacysearchitem.html, htroot/yacysearchitem.java, source/net/yacy/search/navigator/StringNavigator.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Mon May 15 13:15:16 CEST 2017
by luccioman
Added user interface feedback on results feeding termination status.

Added as an additional icon with title in the search progress bar, to
indicate whether background search feeder threads have terminated or are
still running. While giving a bit more information to users about the p2p
search process, this can help in choosing whether or not to wait a little
longer before going to the next page, in order to get results from
various sources sorted as best as possible (see #91 for a discussion
about sorting accuracy and network latency).

Other related modifications included :
 - regular updates to statistics in the progress bar until the
background feeders are completely terminated.
 - removed some uses of insecure and discouraged JavaScript elements
Changed Files: htroot/js/yacysearch.js, htroot/yacysearch.html, htroot/yacysearchitem.html, htroot/yacysearchitem.java, htroot/yacysearchlatestinfo.java, htroot/yacysearchlatestinfo.json, source/net/yacy/search/query/SearchEvent.java
Thu May 11 18:02:33 CEST 2017
by luccioman
Improved previous merge "Show ranking in HTML UI".

- added the new setting as configurable in the "Debug/Analysis" settings
page. Debug/analysis is its main purpose for now as there is currently
no nice and "understandable" ranking score info servlet (see forum
discussion http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5884 ) 
- render in the "Search Page Layout" page preview when enabled
- added constants
Changed Files: defaults/yacy.init, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/SettingsAck_p.java, htroot/Settings_Debug.inc, htroot/Settings_p.java, htroot/yacysearchitem.html, source/net/yacy/search/SwitchboardConstants.java
Fri Apr 14 14:32:44 CEST 2017
by luccioman
Extended Mediawiki dump import to remote URLs.

When using a public HTTP URL in /IndexImportMediawiki_p.html, the remote
file now is directly streamed and processed, allowing import of several
GB dumps even with a low memory remote peer, and without need to
manually download the dump file first.
Changed Files: bin/importmediawiki.sh, htroot/IndexImportMediawiki_p.html, htroot/IndexImportMediawiki_p.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/crawler/retrieval/FileLoader.java, source/net/yacy/crawler/retrieval/SMBLoader.java, source/net/yacy/document/importer/MediawikiImporter.java, source/net/yacy/document/parser/htmlParser.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/index/DocumentIndex.java
Thu Apr 06 21:18:01 CEST 2017
by reger
upd to Solr-5.5.4
Changed Files: .classpath, build.xml, lib/lucene-analyzers-common-5.5.4.jar, lib/lucene-analyzers-phonetic-5.5.4.jar, lib/lucene-backward-codecs-5.5.4.jar, lib/lucene-classification-5.5.4.jar, lib/lucene-codecs-5.5.4.jar, lib/lucene-core-5.5.4.jar, lib/lucene-facet-5.5.4.jar, lib/lucene-grouping-5.5.4.jar, lib/lucene-highlighter-5.5.4.jar, lib/lucene-join-5.5.4.jar, lib/lucene-memory-5.5.4.jar, lib/lucene-misc-5.5.4.jar, lib/lucene-queries-5.5.4.jar, lib/lucene-queryparser-5.5.4.jar, lib/lucene-spatial-5.5.4.jar, lib/lucene-suggest-5.5.4.jar, lib/solr-core-5.5.4.jar, lib/solr-solrj-5.5.4.jar, pom.xml
Tue Apr 04 00:59:26 CEST 2017
by reger
upd to pdfbox-2.0.5.jar and transitive dependency xmpcore-5.1.3.jar
required by metadata-extractor-2.10.1 (fix build.xml compiler warning)
Changed Files: .classpath, build.xml, lib/fontbox-2.0.5.License, lib/fontbox-2.0.5.jar, lib/pdfbox-2.0.5.License, lib/pdfbox-2.0.5.jar, lib/xmpcore-5.1.3.jar, lib/xmpcore-5.1.3.license, pom.xml
Mon Apr 03 11:34:49 CEST 2017
by luccioman
Set Config Portal as a private administration page.

Consistent with the credentials already required to submit its form, and
because external unauthenticated users do not need to access these
settings.
Changed Files: defaults/yacy.init, htroot/ConfigAppearance_p.html, htroot/ConfigPortal.java, htroot/ConfigPortal_p.html, htroot/ConfigPortal_p.java, htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/env/templates/header.template, htroot/env/templates/submenuPortalConfiguration.template, locales/cn.lng, locales/de.lng, locales/fr.lng, locales/hi.lng, locales/ja.lng, locales/master.lng.xlf, locales/ru.lng, locales/uk.lng, source/net/yacy/http/servlets/GSAsearchServlet.java
Fri Mar 31 00:58:11 CEST 2017
by reger
Implement surrogate import from WARC archives, as a first option to handle
WARC (Web ARChive File Format).
WARC files with extension .warc or compressed .warc.gz can be placed in
DATA/surrogate/in, and the responses they contain are imported into the index.
The library used is stream based, so it can easily be extended later to
load WARC files from the net.
Changed Files: .classpath, build.xml, lib/jwat-archive-common-1.0.4.jar, lib/jwat-common-1.0.4.jar, lib/jwat-gzip-1.0.4.jar, lib/jwat-warc-1.0.4.jar, pom.xml, source/net/yacy/document/importer/WarcImporter.java, source/net/yacy/search/Switchboard.java
Sun Mar 26 11:48:00 CEST 2017
by luccioman
Enforced access controls on some administrative actions.

 - ensure use of HTTP POST method : HTTP GET should only be used for
information retrieval and not to perform server side effect operations
(see HTTP standard https://tools.ietf.org/html/rfc7231#section-4.2.1)
 - a transaction token is now required for these administrative form
submissions to ensure the request can not be included in an external
site and performed silently/by mistake by the user browser
Changed Files: bin/clearall.sh, bin/clearcache.sh, bin/clearindex.sh, bin/deleteurl.sh, bin/passwd.sh, bin/protectedPostApiCall.sh, htroot/ConfigAccounts_p.html, htroot/ConfigAccounts_p.java, htroot/ConfigProperties_p.html, htroot/ConfigProperties_p.java, htroot/ConfigUpdate_p.html, htroot/ConfigUpdate_p.java, htroot/IndexControlRWIs_p.html, htroot/IndexControlRWIs_p.java, htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, htroot/IndexDeletion_p.html, htroot/IndexDeletion_p.java, htroot/IndexFederated_p.html, htroot/IndexFederated_p.java, htroot/PerformanceQueues_p.html, htroot/PerformanceQueues_p.java, htroot/Performance_p.html, htroot/Steering.html, htroot/Steering.java, htroot/env/templates/header.template, htroot/terminal_p.html, source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/data/BadTransactionException.java, source/net/yacy/data/TransactionManager.java, source/net/yacy/http/servlets/DisallowedMethodException.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/yacy.java, stopYACY.sh
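The two checks listed in this commit (POST only, plus a transaction token) can be sketched as below. Everything here is an assumption made for illustration: the class name, the single-use token policy and the token format are not taken from YaCy's TransactionManager.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class TransactionTokenSketch {
    private final SecureRandom random = new SecureRandom();
    private final Set<String> issued = ConcurrentHashMap.newKeySet();

    /** Creates an unpredictable token to embed as a hidden field in an administrative form. */
    String issueToken() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        issued.add(token);
        return token;
    }

    /** Accepts a request only when it is a POST carrying a token issued earlier (single use here). */
    boolean isAllowed(String httpMethod, String token) {
        if (!"POST".equals(httpMethod)) {
            return false; // GET must not trigger server side effects
        }
        return token != null && issued.remove(token);
    }
}
```

An external site can make a browser send a request to the peer, but it cannot read the token from the peer's form, so a forged request fails the second check.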
Tue Mar 21 17:15:01 CET 2017
by luccioman
Updated shell scripts to be compatible with HTTP Digest authentication

Because curl and wget do not accept a hashed password as a parameter,
YaCy shell scripts which require authentication are now interactive by
default when HTTP Digest is the only available authentication method.
Batch mode is still available through the use of an environment
variable : YACY_ADMIN_PASSWORD.

Other improvements :
 - added backward compatibility for Basic Authentication
 - fixed curl/wget presence detection 
 - do not return with exit code 0 when an API call failed, and print an
error message when this occurs
 - documented available authentication options for API calls
Changed Files: bin/apicall.sh, bin/apicat.sh, bin/down.sh, bin/passwd.sh, bin/search1.sh, stopYACY.sh
Sun Mar 19 02:30:08 CET 2017
by reger
Introduce the option to configure a shutdown port.
A port value of -1 will disable this option.

If set to a value greater than 0, YaCy listens on this port on the local
loopback address (127.0.0.1) for a shutdown or restart signal.
E.g. connecting to http://localhost:8005/shutdown will stop the YaCy server;
http://localhost:8005/restart will restart it.
This option makes it possible to stop YaCy locally, independently of the web
frontend (which might be configured for password-protected remote access).



Changed Files: defaults/yacy.init, htroot/SettingsAck_p.html, htroot/SettingsAck_p.java, htroot/Settings_ServerAccess.inc, htroot/Settings_p.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/server/serverSwitch.java
Sat Mar 18 20:02:26 CET 2017
by reger
add switchboardconstants for server ports config keys
Changed Files: htroot/ConfigBasic.java, htroot/QuickCrawlLink_p.java, htroot/SettingsAck_p.java, htroot/api/snapshot.java, source/net/yacy/gui/Tray.java, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/migration.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/Seed.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/utils/upnp/UPnP.java, source/net/yacy/yacy.java
Tue Feb 28 18:11:54 CET 2017
by luccioman
Privacy enhancement : added settings to control referrer policy.

HTTP "Referer" header sent by the browser when using YaCy can now be
controlled either with the referrer meta tag as a global policy, or only
for search result links by adding the attribute rel="noreferrer".

To improve privacy with the fewest possible regressions, the default is
set as meta tag with value "origin-when-cross-origin" : internal YaCy
link behavior is not affected, but when visiting external websites the
referrer URL is not empty but stripped of its query parameters and path.

Older browsers, Safari, MS IE and Edge do not support the referrer meta
tag, so the standard but less flexible noreferrer link type can also be
enabled as an alternative.

User-friendly settings page to be implemented.
Changed Files: defaults/yacy.init, htroot/env/templates/metas.template, htroot/yacysearchitem.html, htroot/yacysearchitem.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/search/SwitchboardConstants.java
Mon Feb 20 10:48:07 CET 2017
by luccioman
Refactored and enforced Solr mandatory fields for proper operation

- Added a new method to check activation of mandatory fields on
Collection Configuration commit, consistently with checks previously
performed in Switchboard startup and with mandatory fields in the
default schema.
- Reorganized default schema and CollectionConfiguration enumeration :
moved fields that are no longer mandatory to a specific section, and moved
fields enabled at startup to the mandatory section. 
- Marked mandatory fields as required and with stronger font in the
IndexSchema_p.html page
Changed Files: defaults/solr.collection.schema, htroot/IndexSchema_p.html, htroot/IndexSchema_p.java, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java
Mon Feb 13 19:11:17 CET 2017
by luccioman
Added support for HTML OpenSearch results.

Many OpenSearch systems do not provide results as standard RSS/Atom
feeds but only as HTML. 

This modification adds some support for custom OpenSearch HTML results
through the use of mapping files (as already done for federated Solr
search) relying on CSS-like selectors to retrieve information from HTML
content.

An example mapping file is provided to map results from the
www.npmjs.com OpenSearch URL.
Changed Files: defaults/federatecfg/npmjs.html.map.properties, defaults/heuristicopensearch.conf, source/net/yacy/cora/federate/AbstractFederateSearchConnector.java, source/net/yacy/cora/federate/FederateSearchManager.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/http/HTTPClient.java
Sat Feb 11 19:53:27 CET 2017
by reger
upd to Jetty-9.2.21.v20170120
Changed Files: .classpath, build.xml, lib/jetty-client-9.2.21.v20170120.jar, lib/jetty-continuation-9.2.21.v20170120.jar, lib/jetty-deploy-9.2.21.v20170120.jar, lib/jetty-http-9.2.21.v20170120.jar, lib/jetty-io-9.2.21.v20170120.jar, lib/jetty-jmx-9.2.21.v20170120.jar, lib/jetty-proxy-9.2.21.v20170120.jar, lib/jetty-security-9.2.21.v20170120.jar, lib/jetty-server-9.2.21.v20170120.jar, lib/jetty-servlet-9.2.21.v20170120.jar, lib/jetty-servlets-9.2.21.v20170120.jar, lib/jetty-util-9.2.21.v20170120.jar, lib/jetty-webapp-9.2.21.v20170120.jar, lib/jetty-xml-9.2.21.v20170120.jar, pom.xml
Thu Feb 09 11:05:06 CET 2017
by luccioman
Added a new Debug/Analysis advanced settings subsection.

As discussed in PR #93 with @JeremyRand and @reger24 this new advanced
settings page includes:
 - a new setting to control remote Solr responses encoding
 - some existing debug settings which could not be set through the admin
user interface
Changed Files: defaults/yacy.init, htroot/SettingsAck_p.html, htroot/SettingsAck_p.java, htroot/Settings_Debug.inc, htroot/Settings_p.html, htroot/Settings_p.java, source/net/yacy/cora/federate/SolrFederateSearchConnector.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/AutoSearch.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/Fulltext.java
Fri Jan 27 15:47:15 CET 2017
by luccioman
Added user-friendly controls over disk usage configuration settings.

As mentioned in issue #103, control settings over YaCy disk usage
already existed but lacked a user-friendly way to set them.

I added it to the Performance_p.html administration page with a little
refactoring on the "Resource Observer" fieldset for improved
accessibility and HTML standards compliance.
Also added the possibility to enable/disable the autoregulation function
from this page.
Changed Files: htroot/PerformanceQueues_p.java, htroot/Performance_p.html, htroot/env/base.css, locales/cn.lng, locales/de.lng, locales/master.lng.xlf, locales/ru.lng, locales/uk.lng, source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/SwitchboardConstants.java
Sun Jan 22 23:58:46 CET 2017
by reger
Group all proxy settings on System Administration by adding settings of
UrlProxyAccess page (moved from deleted AugmentedBrowsing_p), adjust
submenu (remove Augmented Browsing) and translation files.
Changed Files: htroot/ConfigSearchPage_p.html, htroot/SettingsAck_p.html, htroot/SettingsAck_p.java, htroot/Settings_UrlProxyAccess.inc, htroot/Settings_p.html, htroot/Settings_p.java, htroot/Status_p.inc, htroot/env/templates/submenuSemantic.template, locales/de.lng, locales/fr.lng, locales/ja.lng, locales/master.lng.xlf, locales/ru.lng, source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/http/servlets/YaCyProxyServlet.java
Sat Jan 21 00:26:04 CET 2017
by reger
upd to solr-5.5.3
minor bugfix version
Changed Files: .classpath, build.xml, lib/lucene-analyzers-common-5.5.3.jar, lib/lucene-analyzers-phonetic-5.5.3.jar, lib/lucene-backward-codecs-5.5.3.jar, lib/lucene-classification-5.5.3.jar, lib/lucene-codecs-5.5.3.jar, lib/lucene-core-5.5.3.jar, lib/lucene-facet-5.5.3.jar, lib/lucene-grouping-5.5.3.jar, lib/lucene-highlighter-5.5.3.jar, lib/lucene-join-5.5.3.jar, lib/lucene-memory-5.5.3.jar, lib/lucene-misc-5.5.3.jar, lib/lucene-queries-5.5.3.jar, lib/lucene-queryparser-5.5.3.jar, lib/lucene-spatial-5.5.3.jar, lib/lucene-suggest-5.5.3.jar, lib/solr-core-5.5.3.jar, lib/solr-solrj-5.5.3.jar, pom.xml
Mon Jan 09 16:44:47 CET 2017
by luccioman
Cleaned up some Javadoc warnings.
Changed Files: source/net/yacy/cora/date/ISO8601Formatter.java, source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/data/list/ListAccumulator.java, source/net/yacy/data/list/XMLBlacklistImporter.java, source/net/yacy/data/ymark/YMarkUtil.java, source/net/yacy/document/AbstractParser.java, source/net/yacy/document/Document.java, source/net/yacy/document/LargeNumberCache.java, source/net/yacy/document/LibraryProvider.java, source/net/yacy/document/Parser.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/importer/Importer.java, source/net/yacy/document/importer/MediawikiImporter.java, source/net/yacy/document/importer/ResumptionToken.java, source/net/yacy/document/parser/apkParser.java, source/net/yacy/document/parser/docParser.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/Evaluation.java, source/net/yacy/document/parser/html/ImageEntry.java, source/net/yacy/document/parser/html/TransformerWriter.java, source/net/yacy/document/parser/htmlParser.java, source/net/yacy/gui/framework/Switchboard.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/navigator/LanguageNavigator.java, source/net/yacy/search/navigator/Navigator.java, source/net/yacy/search/navigator/RestrictedStringNavigator.java, source/net/yacy/search/navigator/YearNavigator.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/snippet/TextSnippet.java
Wed Jan 04 17:09:37 CET 2017
by luccioman
Upgraded jgit build library to version 4.5.0

This is the latest Java 7 compatible jgit release.

Properly support GitHub tags marked as "Pre-release". 
With the previous venerable jgit version 1.1.0, a YaCy repository clone
having such a tag made GitRevTask and GitRevMavenTask crash.
Changed Files: build.xml, libbuild/GitRevMavenTask/pom.xml, libbuild/GitRevMavenTask/src/GitRevMavenTask.java, libbuild/GitRevTask/GitRevTask.java, libbuild/JavaEWAH-0.7.9.License, libbuild/JavaEWAH-0.7.9.jar, libbuild/httpclient-4.3.6.License, libbuild/httpclient-4.3.6.jar, libbuild/jsch-0.1.53.License, libbuild/jsch-0.1.53.jar, libbuild/org.eclipse.jgit-4.5.0.201609210915-r.License, libbuild/org.eclipse.jgit-4.5.0.201609210915-r.jar, libbuild/slf4j-api-1.7.2.License, libbuild/slf4j-api-1.7.2.jar, pom.xml


Bugfixes   
Jump to: YaCy Release current_development top / Other Changes

CommitDescription
Tue Aug 29 07:32:33 CEST 2017
by luccioman
Fixed Unresolved_Pattern occurrence on results favicon HTML id.
Changed Files: htroot/yacysearchitem.java
Sun Jul 16 14:39:53 CEST 2017
by luccioman
Distinguish response parsing failures from unexpected exceptions.
Changed Files: source/net/yacy/crawler/retrieval/Response.java
Tue Jul 11 09:00:27 CEST 2017
by luccioman
Fixed read/copy on input streams reading sometimes less than expected.
Changed Files: source/net/yacy/kelondro/util/FileUtils.java, test/java/net/yacy/kelondro/util/FileUtilsTest.java
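The underlying pitfall is general to java.io.InputStream: a single read(byte[], int, int) call may return fewer bytes than requested without the stream being exhausted. The following is a minimal sketch of a read loop that tolerates such short reads. The class and method names are hypothetical and are not the actual FileUtils code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only: names are hypothetical, not YaCy's FileUtils API. */
class ReadFullySketch {

    /**
     * Reads up to count bytes, looping until count is reached or the stream ends.
     * Returns the number of bytes read (less than count only at end of stream).
     */
    static int readFully(final InputStream in, final byte[] buffer, final int count) throws IOException {
        int total = 0;
        while (total < count) {
            // A single read() may legitimately return fewer bytes than requested
            final int n = in.read(buffer, total, count - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    /** Test helper: a stream returning at most 3 bytes per call, to simulate short reads. */
    static InputStream shortReads(final byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(final byte[] b, final int off, final int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(final String[] args) throws IOException {
        final byte[] buffer = new byte[10];
        System.out.println(readFully(shortReads(new byte[10]), buffer, 10));
    }
}
```

A caller that issues only one read() and assumes the buffer is full would see 3 bytes here, which is the kind of behavior the commit message describes.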
Sat Jul 08 22:46:15 CEST 2017
by reger
Fix unresolved pattern in api/share.html by initializing some display variables
Changed Files: htroot/api/share.java
Fri Jun 30 01:06:17 CEST 2017
by luccioman
Do not unnecessarily wrap loader IOExceptions in IOExceptions
Changed Files: source/net/yacy/repository/LoaderDispatcher.java
Thu Jun 08 07:19:16 CEST 2017
by luccioman
Properly close file output streams even in exception scenarios.
Changed Files: htroot/ConfigLanguage_p.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/cora/protocol/ftp/FTPClient.java, source/net/yacy/cora/storage/ZIPWriter.java, source/net/yacy/crawler/data/Transactions.java, source/net/yacy/data/Translator.java, source/net/yacy/document/content/dao/PhpBB3Dao.java, source/net/yacy/document/parser/apkParser.java, source/net/yacy/document/parser/bzipParser.java, source/net/yacy/document/parser/gzipParser.java, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/kelondro/blob/Gap.java, source/net/yacy/kelondro/blob/HeapWriter.java, source/net/yacy/kelondro/index/BinSearch.java, source/net/yacy/kelondro/index/RowHandleMap.java, source/net/yacy/kelondro/index/RowHandleSet.java, source/net/yacy/kelondro/util/XMLTables.java, source/net/yacy/peers/operation/yacyRelease.java, source/net/yacy/repository/Blacklist.java, source/net/yacy/search/AutoSearch.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/server/serverSwitch.java, source/net/yacy/utils/gzip.java, source/net/yacy/utils/tarTools.java, source/net/yacy/utils/translation/TranslatorXliff.java, source/net/yacy/visualization/AnimationGIF.java, source/net/yacy/visualization/AnimationPlotter.java, source/net/yacy/visualization/ChartPlotter.java, source/net/yacy/visualization/RasterPlotter.java
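One common way to guarantee that a stream is closed when a write fails is the try-with-resources statement. The sketch below demonstrates the idea with a hypothetical test stream whose writes always fail. Whether the commit used try-with-resources or explicit finally blocks is not stated here, so this is only an illustration of the pattern.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative only: names are hypothetical and not taken from the YaCy sources. */
class CloseOnExceptionSketch {

    /** Test double: every write fails, and close() records that it was called. */
    static final class FailingStream extends OutputStream {
        boolean closed = false;

        @Override
        public void write(final int b) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void close() {
            this.closed = true;
        }
    }

    /** Writes data and always closes the stream, even when write() throws. */
    static void writeAndClose(final OutputStream stream, final byte[] data) throws IOException {
        try (OutputStream out = stream) {
            out.write(data);
        }
    }

    public static void main(final String[] args) {
        final FailingStream stream = new FailingStream();
        try {
            writeAndClose(stream, new byte[] { 1, 2, 3 });
        } catch (final IOException e) {
            System.out.println("write failed: " + e.getMessage());
        }
        System.out.println("closed: " + stream.closed);
    }
}
```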
Tue May 30 12:32:14 CEST 2017
by luccioman
Fixed unescaping of URLs containing '%' chars that are not percent-encoded
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/java/net/yacy/cora/document/id/MultiProtocolURLTest.java
Tue May 30 08:48:20 CEST 2017
by luccioman
Fixed scraper NullPointerException cases on malformed URLs.
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java
Thu May 18 00:28:12 CEST 2017
by Michael Peter Christen
enhanced debugging
Changed Files: source/net/yacy/search/schema/CollectionSchema.java
Tue May 09 12:15:41 CEST 2017
by luccioman
Fixed Debian install message misspelling.
Changed Files: debian/yacy.templates
Thu May 04 08:45:30 CEST 2017
by luccioman
Fixed the previously added link to scheduled dump operations.
Changed Files: htroot/IndexImportMediawiki_p.html
Mon May 01 11:44:26 CEST 2017
by Michael Peter Christen
copied fix from yacy_grid_parser for wrong array type
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java
Mon Apr 24 13:27:07 CEST 2017
by luccioman
Fixed "Unchecked conversion" compilation warnings.
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/FlatJSONResponseWriter.java, source/net/yacy/cora/util/JSONArray.java, source/net/yacy/cora/util/JSONObject.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/search/navigator/FileTypeNavigator.java, source/net/yacy/search/navigator/HostNavigator.java, source/net/yacy/search/navigator/StringNavigator.java, source/net/yacy/search/navigator/TokenizedStringNavigator.java, source/net/yacy/search/navigator/YearNavigator.java
Fri Apr 14 21:14:26 CEST 2017
by reger
fix unresolved_pattern on missing post parameter api/message.html
Changed Files: htroot/yacy/message.java
Thu Mar 30 15:41:14 CEST 2017
by luccioman
Fixed NPE case and API URL link on Solr HTML output for webgraph core.
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Tue Mar 07 12:27:27 CET 2017
by luccioman
Fixed settingsAck_p.html back link for case where referrer is stripped.
Changed Files: htroot/SettingsAck_p.java
Fri Mar 03 13:46:44 CET 2017
by luccioman
Fixed unresolved pattern case on /yacysearchlatestinfo.json api
Changed Files: htroot/yacysearchlatestinfo.java
Thu Feb 16 02:36:24 CET 2017
by reger
fix NPE in HTMLResponseWriter on missing document title
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Thu Feb 09 10:59:41 CET 2017
by luccioman
Fixed NPE case occurring when local solr index is disabled in search.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Jan 24 11:49:15 CET 2017
by luccioman
Index Browser : fixed display of "Count colors" for authorized users.
Changed Files: htroot/HostBrowser.java
Mon Jan 23 14:54:37 CET 2017
by luccioman
Fixed "-UNRESOLVED_PATTERN-" admin parameter in "load & index" links.
Changed Files: htroot/HostBrowser.java
Sat Jan 21 00:35:05 CET 2017
by reger
fix the missing solr-5.5.2.jar delete from prev. commit
Changed Files:
Mon Jan 09 17:59:01 CET 2017
by luccioman
Fixed 2 failing JUnit tests.
Changed Files: test/java/net/yacy/document/DateDetectionTest.java, test/java/net/yacy/utils/translation/TranslatorXliffTest.java
Mon Jan 09 09:57:53 CET 2017
by luccioman
Fixed some broken Javadoc links.
Changed Files: source/net/yacy/cora/bayes/Classifier.java, source/net/yacy/data/list/ListAccumulator.java, source/net/yacy/search/SwitchboardConstants.java
Mon Jan 09 09:54:14 CET 2017
by luccioman
Fixed maven assembly base directory to match last main YaCy binaries.
Changed Files: assembly.xml


Other Changes   
Jump to: YaCy Release current_development top / Bugfixes

CommitDescription
Wed Aug 30 23:50:14 CEST 2017
by Michael Peter Christen
try to fix problem
with error description
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=6023&p=33889&sid=37bc7aa029422be571b9266cdef43c52#p33889
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Wed Aug 30 12:23:45 CEST 2017
by luccioman
Use local solr filtered results in total search results count.

This modification has a low impact, as any query modifiers
are already applied when requesting the local Solr index.
It mainly affects duplicates detected among results from remote peers.

Also updated javadocs for clarification.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Aug 29 08:16:12 CEST 2017
by luccioman
Make result action links visible when focusing them with keyboard.
Changed Files: htroot/env/base.css
Tue Aug 29 07:39:12 CEST 2017
by luccioman
Removed duplicate HTML class attribute.
Changed Files: htroot/yacysearch.html
Mon Aug 28 19:03:51 CEST 2017
by luccioman
Added a button to manually refresh sorting of p2p search results.

As a server-side oriented alternative to the JavaScript realtime
resorting feature proposed in PR #104.
The goal is the same as in this PR: making it possible to compensate
for the network latency of fetching results from various peers, and to
obtain a consistently ranked result set as soon as possible.
Changed Files: htroot/js/yacysearch.js, htroot/yacysearch.html, htroot/yacysearch.java, source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java, source/net/yacy/search/query/SearchEvent.java
Sun Aug 27 04:22:39 CEST 2017
by reger
update master.lng, RankingSolr_p.html text
Changed Files: locales/master.lng.xlf
Wed Aug 23 08:20:37 CEST 2017
by luccioman
Use Javadoc style comments on SearchEvent properties.

For better code readability and understanding.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Aug 22 14:13:00 CEST 2017
by luccioman
Added unit tests on the gzip parser.
Changed Files: source/net/yacy/document/parser/gzipParser.java, test/java/net/yacy/document/parser/gzipParserTest.java, test/parsertest/umlaute_html_utf8.html.gz, test/parsertest/umlaute_html_xml_txt_gnu.tgz, test/parsertest/umlaute_linux.txt.gz
Tue Aug 22 14:11:35 CEST 2017
by luccioman
Finer control on max links to parse in the html parser.
Changed Files: source/net/yacy/cora/storage/SizeLimitedMap.java, source/net/yacy/cora/storage/SizeLimitedSet.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/htmlParser.java, test/java/net/yacy/document/parser/htmlParserTest.java, test/parsertest/umlaute_html_namedentities.html
Tue Aug 22 14:06:09 CEST 2017
by luccioman
Added some unit tests on FileUtils.
Changed Files: test/java/net/yacy/kelondro/util/FileUtilsTest.java
Sun Aug 20 22:17:27 CEST 2017
by reger
Allow stopping a currently running WARC import (stop button)
Changed Files: htroot/IndexImportWarc_p.html, htroot/IndexImportWarc_p.java, source/net/yacy/document/importer/WarcImporter.java
Wed Aug 16 14:21:07 CEST 2017
by luccioman
Use unredirected robots.txt URL when adding an entry to the table.
Changed Files: source/net/yacy/crawler/robots/RobotsTxt.java
Wed Aug 16 09:30:33 CEST 2017
by luccioman
Ensure proper synchronous robots entry retrieval on first check.

Previously, when checking the robots.txt policy for the first time on an
unknown host (not cached in the robots table), the result was always empty
in the /getpageinfo_p.xml api and in the /CrawlCheck_p.html page.
Subsequent calls, however, returned the correct information.
Changed Files: htroot/api/getpageinfo_p.java, source/net/yacy/crawler/robots/RobotsTxt.java
Tue Aug 15 21:04:36 CEST 2017
by luccioman
Upgraded Docker base image from deprecated java to openjdk.
Changed Files: docker/Dockerfile, docker/Dockerfile.alpine
Tue Aug 15 10:11:05 CEST 2017
by luccioman
Prevent search result failure on incomplete images information.

Complements the recent modification related to images in commit 7f395ef.

Unfortunately, the metadata of many documents fetched from the freeworld
p2p network has only partial information about embedded images. Without
proper error handling, this made many searches in p2p mode fail
completely.
Changed Files: htroot/yacysearchitem.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Tue Aug 15 07:16:01 CEST 2017
by Michael Peter Christen
added usage of X-Real-IP http header
to identify request IPs which came through NGINX reverse proxy
configurations
Changed Files: source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/http/servlets/SolrSelectServlet.java
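As a rough illustration of the idea: when a reverse proxy forwards the original client address in an X-Real-IP header, the server can prefer that value over the socket address, which would otherwise always be the proxy itself. The helper below is a hypothetical sketch working on a plain header map. It is not the RequestHeader implementation, and a real deployment should only trust this header when requests are known to come through the proxy.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a hypothetical helper, not YaCy's RequestHeader API. */
class RealIpSketch {

    /** Returns the X-Real-IP header value when present and non-blank, else the socket address. */
    static String clientAddress(final Map<String, String> headers, final String socketAddress) {
        final String realIp = headers.get("X-Real-IP");
        if (realIp != null && !realIp.trim().isEmpty()) {
            return realIp.trim();
        }
        return socketAddress;
    }

    public static void main(final String[] args) {
        // HTTP header names are case-insensitive, hence the case-insensitive map
        final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.put("x-real-ip", "203.0.113.7");
        System.out.println(clientAddress(headers, "127.0.0.1"));
    }
}
```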
Mon Aug 14 20:12:09 CEST 2017
by Michael Peter Christen
added image link in search results
This should be a help to make a preview of search results.
The image is computed from the list of embedded images, it is
always the first image in that list.
In rss-type results the image is presented like
<media:content medium="image" url="https://abc.xyz/logo.png"/>
as defined in
http://www.rssboard.org/media-rss#media-content
Changed Files: htroot/yacysearchitem.java, htroot/yacysearchitem.json, htroot/yacysearchitem.xml, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Mon Aug 14 14:47:01 CEST 2017
by luccioman
Also handle text content when parsing XML within limits.
Changed Files: source/net/yacy/document/parser/GenericXMLParser.java, test/java/net/yacy/document/parser/GenericXMLParserTest.java
Mon Aug 14 02:16:43 CEST 2017
by reger
Add junit test for AbstractOperations.addOperand()
Changed Files: test/java/net/yacy/cora/federate/solr/logic/AbstractOperationsTest.java
Mon Aug 14 01:03:15 CEST 2017
by reger
Correction of https://github.com/yacy/yacy_search_server/commit/d03e2c98ea6bd5701c8e8257174c439b9c006afb
Fix Conjunction.addOperator to do nothing if term is empty.
This prevents building a query string with a repeated logical operator
like "field:term AND AND field:term",
possibly causing out of memory errors in postprocessing_doublecontent
Changed Files: source/net/yacy/cora/federate/solr/logic/AbstractOperations.java
Mon Aug 14 00:52:03 CEST 2017
by reger
Fix Conjunction.addOperator to do nothing if term is empty.
This prevents building a query string with a repeated logical operator
like "field:term AND AND field:term",
possibly causing out of memory errors in postprocessing_doublecontent
Changed Files: source/net/yacy/cora/federate/solr/logic/AbstractOperations.java
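The effect of skipping empty terms can be sketched as follows. This is a simplified, hypothetical stand-in for the conjunction builder (class and method names are assumptions, not the AbstractOperations code): empty operands are ignored, so the joined query never contains two adjacent operators.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a simplified, hypothetical conjunction builder. */
class ConjunctionSketch {

    private final List<String> terms = new ArrayList<>();

    /** Adds a term to the conjunction; null or blank terms are silently ignored. */
    void addOperand(final String term) {
        if (term == null || term.trim().isEmpty()) {
            return; // nothing to add: avoids producing "a AND  AND b"
        }
        this.terms.add(term.trim());
    }

    /** Joins all retained terms with the AND operator. */
    String toQuery() {
        return String.join(" AND ", this.terms);
    }

    public static void main(final String[] args) {
        final ConjunctionSketch conjunction = new ConjunctionSketch();
        conjunction.addOperand("field:a");
        conjunction.addOperand("");
        conjunction.addOperand("field:b");
        System.out.println(conjunction.toQuery());
    }
}
```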
Sat Aug 12 21:53:04 CEST 2017
by reger
Remove deprecated YaCyProxyServlet
was replaced by UrlProxyServlet
Changed Files: defaults/web.xml
Sat Aug 12 09:43:49 CEST 2017
by luccioman
Prevent unwanted cached bytes duplication on stream parsing.
Changed Files: source/net/yacy/document/TextParser.java
Sat Aug 12 09:42:06 CEST 2017
by luccioman
Updated the XML parser limited parsing test to work with the latest JDK.
Changed Files: test/java/net/yacy/document/parser/GenericXMLParserTest.java
Fri Aug 11 20:34:59 CEST 2017
by luccioman
Updated debian package configuration to match new Java 1.8 target

Following migration from Java 1.7 to Java 1.8 in commit
6fe735945da97abcbb91ac545fb11cff9d48effc
Changed Files: debian/control
Thu Aug 10 23:57:37 CEST 2017
by reger
upd to icu4j-59_1.jar
Changed Files: .classpath, build.xml, lib/icu4j-59_1.jar, pom.xml
Sun Aug 06 23:41:53 CEST 2017
by reger
Skip public post of jre version.
Added to determine switch to java8  https://github.com/yacy/yacy_search_server/commit/596b5dfa5936b25b605c42807730c29a1d08cd15
Changed Files: htroot/Network.html, htroot/Network.java, source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java
Sun Aug 06 23:26:27 CEST 2017
by reger
Replace deprecated ConcurrentHashSet with recommended Java8 
ConcurrentHashMap.newKeySet() in postprocessDocuments()
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
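For reference, ConcurrentHashMap.newKeySet() (available since Java 8) returns a thread-safe Set backed by a ConcurrentHashMap. The sketch below fills such a set from several threads. The class, method, and key names are made up for the example and are unrelated to postprocessDocuments().

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: demonstrates the JDK API, not the YaCy code. */
class KeySetSketch {

    /** Adds threads * perThread distinct keys concurrently and returns the final set size. */
    static int fillConcurrently(final int threads, final int perThread) throws InterruptedException {
        // Thread-safe set view backed by a ConcurrentHashMap
        final Set<String> seen = ConcurrentHashMap.newKeySet();
        final Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (final Thread worker : workers) {
            worker.join();
        }
        return seen.size();
    }

    public static void main(final String[] args) throws InterruptedException {
        System.out.println(fillConcurrently(4, 1000));
    }
}
```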
Sat Aug 05 23:47:27 CEST 2017
by reger
Harmonizing use of xml reader / sax parser in XMLBlacklistImporter
eliminating the need for lib/xercesImpl.jar
Changed Files: .classpath, build.xml, pom.xml, source/net/yacy/data/list/XMLBlacklistImporter.java
Sat Aug 05 22:30:06 CEST 2017
by reger
Patch last_modified date with internal FirstSeenTime() if no date is
provided, to make sure updated documents are indexed with their
last-modified date as provided in the current crawl.
(Always patching moddate with firstseen would risk missing
actual updates.)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Tue Aug 01 00:59:53 CEST 2017
by reger
Remove obsolete Protocol parameter ttl (time to live) 
not interpreted in target yacy/query.html
also Protocol.querySeed() not used and parameter not interpreted in 
target servlet yacy/query.html
Changed Files: source/net/yacy/peers/Protocol.java
Mon Jul 31 23:38:10 CEST 2017
by reger
upd to poi-3.16.jar
Changed Files: .classpath, build.xml, lib/poi-3.16.License, lib/poi-3.16.jar, lib/poi-scratchpad-3.16.jar, pom.xml
Mon Jul 31 01:55:01 CEST 2017
by reger
Replace deprecated getIP with getIPs in Protocol transferURL() and 
getProfile().
Remember used ip for error handling and departInterface
Changed Files: source/net/yacy/peers/Protocol.java
Sun Jul 30 23:02:15 CEST 2017
by reger
Replace one more deprecated peerDeparture in Protocol.transferIndex() 
by moving/using interfaceDeparture() in transferRWI()
Changed Files: source/net/yacy/peers/Protocol.java
Sun Jul 30 20:09:06 CEST 2017
by reger
upd to pdfbox-2.0.7.jar
Changed Files: .classpath, build.xml, lib/fontbox-2.0.7.License, lib/fontbox-2.0.7.jar, lib/pdfbox-2.0.7.License, lib/pdfbox-2.0.7.jar, pom.xml
Sun Jul 23 03:55:56 CEST 2017
by reger
Add SolrConfig ClassicIndexSchemaFactory to prevent Solr startup warning.
This overrides Solr default to use managed schema. As we don't use
programmatic schema changes, this directs Solr to use schema.xml, eliminating
the warning.
Changed Files: defaults/solr/solrconfig.xml
Mon Jul 17 15:35:10 CEST 2017
by luccioman
Log an error when Solr folder migration fails for some reason.
Changed Files: source/net/yacy/search/index/Fulltext.java
Sun Jul 16 23:37:28 CEST 2017
by reger
upd to jwat-warc-1.1.0.jar
Changed Files: .classpath, build.xml, lib/jwat-archive-common-1.1.0.jar, lib/jwat-common-1.1.0.jar, lib/jwat-gzip-1.1.0.jar, lib/jwat-warc-1.1.0.jar, pom.xml
Sun Jul 16 23:35:56 CEST 2017
by reger
upd version for typeahead.jquery.js in jslicense.html
Changed Files: htroot/jslicense.html
Sun Jul 16 14:46:46 CEST 2017
by luccioman
Support parsing gzip files from servers with redundant headers.

Some web servers provide both 'Content-Encoding : "gzip"' and
'Content-Type : "application/x-gzip"' HTTP headers on their ".gz" files.
Failing on such resources was annoying: they are not so uncommon,
although non-conforming (see RFC 7231 section 3.1.2.2 for the
"Content-Encoding" header specification
https://tools.ietf.org/html/rfc7231#section-3.1.2.2)
Changed Files: source/net/yacy/crawler/retrieval/StreamResponse.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/gzipParser.java
Sun Jul 16 14:37:06 CEST 2017
by luccioman
URL Viewer : apply crawler size limits when adding to local index.

This allows parsing and previewing large files, while preventing unwanted
OutOfMemory errors which are likely to occur when adding to the Solr
Index resources larger than configured crawler limits.
Changed Files: htroot/ViewFile.java
Sat Jul 15 00:19:23 CEST 2017
by reger
Clean up unmaintained and unused AugmentParser trail.
Changed Files:
Fri Jul 14 23:41:39 CEST 2017
by reger
Clean up redundant but obsolete jquery.rdfquery-core-1.0.js script lib
Changed Files: htroot/jslicense.html
Thu Jul 13 08:18:40 CEST 2017
by luccioman
Added gzip parser support for max content bytes limit
Changed Files: source/net/yacy/document/parser/gzipParser.java
Thu Jul 13 08:12:10 CEST 2017
by luccioman
Added HTML parser support for maximum content bytes parsing limit 
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/htmlParser.java
Wed Jul 12 16:03:23 CEST 2017
by luccioman
Merge pull request #122 from Scarfmonster/patch-1

I also reproduced the issue, and the fix is working fine.

Thanks @Scarfmonster 
Changed Files: source/net/yacy/http/Jetty9HttpServerImpl.java
Wed Jul 12 00:18:12 CEST 2017
by luccioman
Added RSS parser support for maximum content bytes parsing limit
Changed Files: source/net/yacy/cora/document/feed/RSSFeed.java, source/net/yacy/cora/document/feed/RSSReader.java, source/net/yacy/document/Document.java, source/net/yacy/document/parser/rssParser.java
Wed Jul 12 00:13:24 CEST 2017
by luccioman
Finer control on bounded input streams with custom stream implementation
Changed Files: source/net/yacy/cora/util/StreamLimitException.java, source/net/yacy/cora/util/StrictLimitInputStream.java, source/net/yacy/crawler/retrieval/FileLoader.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/GenericXMLParser.java
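The general technique of a bounded input stream can be sketched with a FilterInputStream that counts bytes and fails once a limit is exceeded. The classes below are hypothetical and only illustrate the idea. They are not the StrictLimitInputStream or StreamLimitException implementations listed above, whose exact behavior is not described here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only: hypothetical classes, not the YaCy implementation. */
class BoundedStreamSketch {

    /** Wraps a stream and throws an IOException once more than maxBytes have been read. */
    static final class LimitedInputStream extends FilterInputStream {
        private final long maxBytes;
        private long count = 0;

        LimitedInputStream(final InputStream in, final long maxBytes) {
            super(in);
            this.maxBytes = maxBytes;
        }

        /** Updates the byte counter; the limit is checked after each read. */
        private void account(final long n) throws IOException {
            this.count += n;
            if (this.count > this.maxBytes) {
                throw new IOException("stream limit of " + this.maxBytes + " bytes exceeded");
            }
        }

        @Override
        public int read() throws IOException {
            final int b = super.read();
            if (b >= 0) {
                account(1);
            }
            return b;
        }

        @Override
        public int read(final byte[] b, final int off, final int len) throws IOException {
            final int n = super.read(b, off, len);
            if (n > 0) {
                account(n);
            }
            return n;
        }
    }

    /** Reads the stream to its end in small chunks and returns the number of bytes read. */
    static long drain(final InputStream in) throws IOException {
        final byte[] buffer = new byte[4];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) >= 0) {
            total += n;
        }
        return total;
    }

    public static void main(final String[] args) throws IOException {
        final byte[] data = new byte[10];
        System.out.println(drain(new LimitedInputStream(new ByteArrayInputStream(data), 10)));
        try {
            drain(new LimitedInputStream(new ByteArrayInputStream(data), 5));
        } catch (final IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing an exception, rather than silently signaling end of stream at the limit, lets the caller distinguish a truncated resource from a complete one.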
Tue Jul 11 09:07:48 CEST 2017
by luccioman
Added parsing within bounds implementation to the generic parser.
Changed Files: source/net/yacy/document/parser/genericParser.java
Tue Jul 11 09:06:37 CEST 2017
by luccioman
Support trying multiple parsers even when streaming on large resources.
Changed Files: source/net/yacy/document/TextParser.java
Tue Jul 11 09:04:23 CEST 2017
by luccioman
Support loading local files with a per request specified maximum size.

Consistently with the HTTP loader implementation.
Changed Files: source/net/yacy/crawler/retrieval/FileLoader.java, source/net/yacy/repository/LoaderDispatcher.java
Sun Jul 09 23:08:54 CEST 2017
by reger
Fix css conflict of YMarks.html to make it viewable.
yacy-ymarks.css sidebar conflicts with bootstraps sidebar (different
overlay settings). Simply renamed it to ymark-sidebar.
Changed Files: htroot/YMarks.html, htroot/env/yacy-ymarks.css
Sat Jul 08 23:46:10 CEST 2017
by reger
upd to commons-fileupload-1.3.3.jar
Changed Files: .classpath, build.xml, lib/commons-fileupload-1.3.3.License, lib/commons-fileupload-1.3.3.jar, pom.xml
Mon Jul 03 14:53:36 CEST 2017
by luccioman
Removed temporary html parser test code
Changed Files: test/java/net/yacy/document/parser/htmlParserTest.java
Mon Jul 03 13:51:14 CEST 2017
by luccioman
URL Viewer : decode raw text using the response charset when provided.

Otherwise decode as UTF-8, as previously done.
Changed Files: htroot/ViewFile.java
Mon Jul 03 10:00:53 CEST 2017
by luccioman
HTML parser : removed unnecessary remaining recursive processing

Recursive processing was removed in commit
67beef657f82e92f48dd8425073ad81896a2ff4b, but one remained for anchor
content (likely omitted from the refactoring). It is no longer necessary:
other links such as images embedded in anchors are currently correctly
detected by the parser.

More annoying : that remaining recursive processing could lead to almost
endless processing when encountering some (invalid) HTML structures
involving nested anchors, as detected and reported by lucipher on YaCy
forum ( http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6005 ).
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, test/java/net/yacy/document/parser/htmlParserTest.java
Fri Jun 30 11:41:48 CEST 2017
by luccioman
Updated PerformanceQueues_p.xml API with last related servlet changes
Changed Files: htroot/PerformanceQueues_p.xml
Fri Jun 30 11:30:54 CEST 2017
by luccioman
Made remote search max system load limits configurable from UI.

As reported by davide on YaCy forums (
http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6004 ), when the
system is under high load, it could be difficult to understand why
remote search results are not fetched without carefully reading the
YaCy configuration file.
Changed Files: htroot/PerformanceQueues_p.html, htroot/PerformanceQueues_p.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/SwitchboardConstants.java
Fri Jun 30 02:11:18 CEST 2017
by reger
Add keyword constraint to rwi query result filter
To discard rwi results not matching query keyword: parameter 
Changed Files: source/net/yacy/search/query/SearchEvent.java
Fri Jun 30 01:13:47 CEST 2017
by luccioman
Apply consistent behavior on HTTP resource size exceeding limit.

On content size known from HTTP headers, terminates connection faster
and improves error reports quality by reporting relevant message
"Content to download exceed maximum value..." rather than previously "no
response (NULL) for url...".
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Fri Jun 30 00:30:54 CEST 2017
by luccioman
Respect maxFileSize limit also when streaming HTTP and when relevant.

Constraint applied consistently with HTTP content full load in byte
array.
Changed Files: source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/visualization/ImageViewer.java
Thu Jun 29 11:36:47 CEST 2017
by luccioman
Added an informative title on the crawl start robots.txt status icon
Changed Files: htroot/js/IndexCreate.js
Thu Jun 29 11:25:27 CEST 2017
by luccioman
Crawl start Ajax request : properly handle eventual XML parsing errors

Otherwise on a malformed getpageinfo_p XML response (from the browser
point of view), JavaScript errors were thrown and the ajax status
steering wheel remained displayed indefinitely.
Changed Files: htroot/js/IndexCreate.js
Tue Jun 27 19:30:40 CEST 2017
by luccioman
Refactored plain-text URLs detection implementation.

For faster processing (measured about 2 times faster on many real-world
examples) and more advanced detection (previous algorithm detected only
URLs separated from the rest of the text by a space character).
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, test/java/net/yacy/document/parser/html/ContentScraperTest.java
Mon Jun 26 17:33:56 CEST 2017
by luccioman
Made mime type and extension normalization locale independent.

Previously, upper cased mime type was incorrectly normalized when the
default locale is Turkish.
Changed Files: source/net/yacy/document/TextParser.java, test/java/net/yacy/document/TextParserTest.java
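The Turkish locale issue comes from String.toLowerCase() using the default locale: in Turkish, the upper case 'I' is lower-cased to the dotless 'ı' (U+0131), so a mime type such as "APPLICATION/PDF" no longer matches its expected ASCII form. A minimal sketch of a locale-independent normalization, using a hypothetical helper name and Locale.ROOT:

```java
import java.util.Locale;

/** Illustrative only: a hypothetical helper, not the TextParser code. */
class LocaleNormalizationSketch {

    /** Lower-cases a mime type or file extension independently of the default locale. */
    static String normalize(final String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(final String[] args) {
        final String mime = "APPLICATION/PDF";
        // Locale-sensitive lower-casing: 'I' becomes the dotless 'ı' in Turkish
        System.out.println(mime.toLowerCase(Locale.forLanguageTag("tr")));
        // Locale-independent lower-casing
        System.out.println(normalize(mime));
    }
}
```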
Sun Jun 25 20:05:37 CEST 2017
by reger
upd to jwat-warc-1.0.6.jar
Changed Files: .classpath, build.xml, lib/jwat-archive-common-1.0.6.jar, lib/jwat-common-1.0.6.jar, lib/jwat-gzip-1.0.6.jar, lib/jwat-warc-1.0.6.jar, pom.xml
Sat Jun 24 23:15:25 CEST 2017
by reger
remove unused Solr optional extra handler lib solr-dataimporthandler-6.6.0.jar
Changed Files: .classpath, build.xml
Sat Jun 24 22:54:43 CEST 2017
by reger
upd to jsoup-1.10.3.jar
Changed Files: .classpath, build.xml, lib/jsoup-1.10.3.jar, pom.xml
Fri Jun 23 02:23:49 CEST 2017
by Ryszard Go?
Wrong password was removed after the SSL certificate import

Removing the keystore password will prevent ssl from working after the next restart. The certificate password should be removed instead.
Fixes http://mantis.tokeek.de/view.php?id=687
Changed Files: source/net/yacy/http/Jetty9HttpServerImpl.java
Thu Jun 22 10:50:34 CEST 2017
by luccioman
Improved character encoding detection from Content-Type header

Also updated some related JavaDocs
Changed Files: source/net/yacy/cora/protocol/HeaderFramework.java, test/java/net/yacy/cora/protocol/HeaderFrameworkTest.java
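As a rough illustration of what charset detection from a Content-Type header involves, the sketch below extracts the charset parameter, tolerating mixed case, optional quotes, and unknown charset names. It is a hypothetical helper written for this example. The actual HeaderFramework logic may handle more cases.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Illustrative only: a hypothetical helper, not the HeaderFramework code. */
class ContentTypeCharsetSketch {

    /** Returns the charset declared in a Content-Type value, or the fallback when absent or invalid. */
    static Charset charsetOf(final String contentType, final Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (final String part : contentType.split(";")) {
            final String param = part.trim();
            // Parameter names are case-insensitive
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // The value may be a quoted string
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (final IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(final String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/html", StandardCharsets.UTF_8));
    }
}
```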
Wed Jun 21 09:14:50 CEST 2017
by luccioman
Added a basic JUnit test with test gz files for the gzip parser
Changed Files: test/java/net/yacy/document/parser/gzipParserTest.java, test/parsertest/umlaute_html_utf8.html.gz, test/parsertest/umlaute_linux.txt.gz
Wed Jun 21 09:11:17 CEST 2017
by luccioman
Properly close test files in htmlParser unit test
Changed Files: test/java/net/yacy/document/parser/htmlParserTest.java
Mon Jun 19 17:02:11 CEST 2017
by luccioman
Prevent integer overflow in table statistics and use strong typing
Changed Files: htroot/PerformanceMemory_p.java, source/net/yacy/kelondro/table/Table.java
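The kind of overflow involved can be illustrated with a size computation: multiplying two int values silently wraps around when the result exceeds Integer.MAX_VALUE, whereas widening one operand to long first gives the exact result. The names and figures below are made up for the example and are not taken from Table.java.

```java
/** Illustrative only: hypothetical names and figures. */
class OverflowSketch {

    /** Computes rowCount * rowSize in 64-bit arithmetic to avoid int overflow. */
    static long totalBytes(final int rowCount, final int rowSize) {
        return (long) rowCount * rowSize;
    }

    public static void main(final String[] args) {
        final int rowCount = 3_000_000;
        final int rowSize = 1_000;
        // int multiplication wraps around: the result is negative
        System.out.println(rowCount * rowSize);
        // long multiplication yields the expected 3000000000
        System.out.println(totalBytes(rowCount, rowSize));
    }
}
```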
Sat Jun 17 09:33:14 CEST 2017
by luccioman
Limit the number of initially previewed links in crawl start pages.

This prevents rendering a big and inconvenient scrollbar on resources
containing many links.
If really needed, preview of all links is still available with a "Show
all links" button.

Doesn't affect the number of links used once the crawl is effectively
started, as the list is then loaded again server-side.
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartSite.html, htroot/api/getpageinfo_p.java, htroot/api/getpageinfo_p.xml, htroot/js/IndexCreate.js
Sat Jun 17 09:26:37 CEST 2017
by luccioman
Improved stream-oriented parsing entering conditions.
Changed Files: source/net/yacy/document/TextParser.java
Fri Jun 16 08:50:57 CEST 2017
by luccioman
Limit scope of some local JavaScript variables.
Changed Files: htroot/js/IndexCreate.js
Fri Jun 16 08:44:40 CEST 2017
by Michael Peter Christen
added json(p) endpoint for crawl start
Changed Files: htroot/Crawler_p.java, htroot/Crawler_p.json
Fri Jun 16 06:31:45 CEST 2017
by reger
make nsis build script require java 8
Changed Files: build.nsi
Fri Jun 16 02:17:49 CEST 2017
by reger
update nsi installer java autodl bundleid to use jre-8u131
Changed Files: build.nsi
Fri Jun 16 00:12:09 CEST 2017
by reger
remove reference to velocityresponsewriter in solrconfig.xml 
it is no longer part of solr-core api
http://lucene.apache.org/solr/6_6_0/index.html
Changed Files: defaults/solr/solrconfig.xml
Thu Jun 15 21:02:18 CEST 2017
by reger
remove sample path setting in solrconfig.xml not valid in Yacy
resulting in a startup stop exception after a fresh switch to 1.921
Changed Files: defaults/solr/solrconfig.xml
Thu Jun 15 20:24:53 CEST 2017
by reger
update maven pom setting to YaCy version 1.921 
java 1.8 and solr 6.6
Changed Files: pom.xml
Thu Jun 15 14:13:46 CEST 2017
by luccioman
Prevent high CPU load at startup, caused by the Solr suggester build.

Reported by Collision on mantis 758 (
http://mantis.tokeek.de/view.php?id=758 ).
Introduced by the new YaCy Solr configuration for Solr 6.6.0 (see commit
6fe735945da97abcbb91ac545fb11cff9d48effc), including now Suggester
configuration.
Changed Files: defaults/solr/solrconfig.xml
Thu Jun 15 09:50:02 CEST 2017
by luccioman
Added HT Cache basic statistics (hit rate)
Changed Files: htroot/ConfigHTCache_p.html, htroot/ConfigHTCache_p.java, source/net/yacy/crawler/data/Cache.java, test/java/net/yacy/crawler/data/CacheTest.java
Thu Jun 15 09:48:22 CEST 2017
by luccioman
Use volatile to ensure concurrent threads use up to date property value
Changed Files: source/net/yacy/kelondro/blob/Compressor.java
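For context, the Java memory model guarantees that a write to a volatile field is visible to subsequent reads of that field by other threads. Without volatile, a thread may keep using a stale value. The sketch below uses a hypothetical configuration class with a made-up `level` property: the reader thread is guaranteed to observe the update and terminate.

```java
/** Illustrative only: a hypothetical configuration holder, not the Compressor class. */
class VolatileSketch {

    /** volatile: a write by one thread is visible to later reads by other threads. */
    private volatile int level = 1;

    void setLevel(final int newLevel) {
        this.level = newLevel;
    }

    int getLevel() {
        return this.level;
    }

    /** Starts a reader spinning on the property, updates it, and reports whether the reader finished. */
    static boolean readerSeesUpdate() throws InterruptedException {
        final VolatileSketch config = new VolatileSketch();
        final Thread reader = new Thread(() -> {
            while (config.getLevel() == 1) {
                // busy-wait until the updated value becomes visible
            }
        });
        reader.setDaemon(true);
        reader.start();
        config.setLevel(9);
        reader.join(5000);
        return !reader.isAlive();
    }

    public static void main(final String[] args) throws InterruptedException {
        System.out.println(readerSeesUpdate());
    }
}
```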
Wed Jun 14 19:02:08 CEST 2017
by luccioman
Made Cache compression level and lock timeout user configurable
Changed Files: defaults/yacy.init, htroot/ConfigHTCache_p.html, htroot/ConfigHTCache_p.java, source/net/yacy/crawler/data/Cache.java, source/net/yacy/kelondro/blob/Compressor.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, test/java/net/yacy/crawler/data/CacheTest.java
Wed Jun 14 08:56:11 CEST 2017
by luccioman
Prevent log pollution from unwanted Solr warnings.

Many non-blocking "java.nio.file.NoSuchFileException" traces with
warning log level can be logged by Solr, especially when heavily
crawling. This issue is known from Solr 5.x but still unresolved with
Solr 6.x ( https://issues.apache.org/jira/browse/SOLR-9120 )

Consequently upgraded to "SEVERE" the default log level of the related
internal Solr class.

See also mantis 727 ( http://mantis.tokeek.de/view.php?id=727 )
Changed Files: defaults/yacy.logging
Fri Jun 09 12:50:36 CEST 2017
by Michael Peter Christen
re-added solr synchronization hack
Changed Files: source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java
Thu Jun 08 07:36:11 CEST 2017
by luccioman
Ensure system resource release by closing document stream.
Changed Files: source/net/yacy/document/TextParser.java
Tue Jun 06 10:30:02 CEST 2017
by luccioman
Removed unnecessary finalize implementation.

On such private classes with limited scope but with frequent instance
creations and removals within the application lifecycle, implementing
the finalize method is particularly unwanted as it decreases the garbage
collector performance.
What's more, the Object.finalize() method is now deprecated in JDK 9
and will eventually disappear from future releases (see
https://bugs.openjdk.java.net/browse/JDK-8177970)
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Sun Jun 04 01:50:40 CEST 2017
by reger
Tokenize result entry keywords and add some styling for display
Changed Files: htroot/env/base.css, htroot/yacysearchitem.html, htroot/yacysearchitem.java
Sat Jun 03 21:58:04 CEST 2017
by reger
upd to commons-compress-1.14.jar
Changed Files: .classpath, build.xml, lib/commons-compress-1.14.License, lib/commons-compress-1.14.jar, pom.xml
Fri Jun 02 09:47:45 CEST 2017
by luccioman
Ensure closing ChunkIterator stream in every possible use case.

Also trace any close failures in logs instead of failing
silently.
This should help prevent holding too many unreleased system file
handlers, as in the case reported by eros on YaCy forum
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5988&sid=b00e7486c1bf7e48a0d63eb328ccca02
)
Changed Files: source/net/yacy/kelondro/table/ChunkIterator.java, source/net/yacy/kelondro/table/Table.java
Fri Jun 02 01:46:06 CEST 2017
by luccioman
Improved consistency between loader openInputStream and load functions
Changed Files: source/net/yacy/crawler/retrieval/FTPLoader.java, source/net/yacy/crawler/retrieval/FileLoader.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/crawler/retrieval/SMBLoader.java, source/net/yacy/crawler/retrieval/StreamResponse.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/visualization/ImageViewer.java
Tue May 30 17:38:16 CEST 2017
by luccioman
Added JavaDoc to the getpageinfo_p API servlet.
Changed Files: htroot/api/getpageinfo_p.java
Tue May 30 09:29:28 CEST 2017
by luccioman
Deprecated duplicated and internally unused getpageinfo servlet.

Redirections set to ease the transition for any remaining external uses:
 - /api/getpageinfo.xml to /api/getpageinfo_p.xml
 - /api/getpageinfo.json to /api/getpageinfo_p.json
Changed Files: htroot/api/getpageinfo.java, htroot/api/getpageinfo_p.json
Mon May 29 19:16:09 CEST 2017
by luccioman
Fixed a NullPointerException case on Digest authentication.

Could occur when upgrading from a Debian package configured with Basic
authentication (as in release 1.92.9000) to a more recent one with
Digest authentication, without having re-encoded the admin password (for
example with dpkg-reconfigure).

As reported by eros on YaCy forum
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5988#p33686).
Changed Files: source/net/yacy/http/YaCyLegacyCredential.java
Wed May 24 22:13:42 CEST 2017
by reger
upd to pdfbox-2.0.6.jar
Changed Files: .classpath, build.xml, lib/fontbox-2.0.6.License, lib/fontbox-2.0.6.jar, lib/pdfbox-2.0.6.License, lib/pdfbox-2.0.6.jar, pom.xml
Wed May 24 08:43:03 CEST 2017
by luccioman
Quoted param value in Solr query to avoid unwanted traces in logs

When Webgraph Solr core is enabled, crawling and removing from index an
URL whose hash starts with the '-' character (example URL :
https://cs.wikipedia.org/ whose hash is "-2-HuTEndn4x") produced a full
ParseException stack trace in YaCy logs. This was not blocking because
the Solr query parser is able to escape the query itself and run it
successfully, but it needlessly filled YaCy logs.
Changed Files: source/net/yacy/search/index/Fulltext.java
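The quoting idea can be sketched as follows. This is an assumption-laden illustration, not the code in Fulltext.java: the helper names and the "id" field name used in the example call are hypothetical. Wrapping the value in double quotes keeps a leading '-' from being read as a query operator.

```java
/** Hypothetical helper illustrating quoted term values in a Solr/Lucene style query. */
final class SolrQuoteSketch {

    /** Wraps a value in double quotes, escaping embedded backslashes and quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a field:"value" clause; the field name is supplied by the caller. */
    static String termQuery(String field, String value) {
        return field + ":" + quote(value);
    }
}
```

For the URL hash quoted above, termQuery("id", "-2-HuTEndn4x") yields id:"-2-HuTEndn4x".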
Tue May 23 07:25:40 CEST 2017
by luccioman
Restored search page default behavior for Tab, Page Up and Down keys

Replaced by shortcuts defined by the HTML "accesskey" attribute, which
has the advantage of being announced by screen readers when focusing the
corresponding buttons, unlike custom JavaScript key handlers.
Now with Firefox :
 - "Alt + Shift + n" for next page
 - "Alt + Shift + p" for previous page

Following ARIA recommendation : "keyboard shortcuts enhance, not
replace, standard keyboard access." ( see
https://www.w3.org/TR/wai-aria-practices/#kbd_shortcuts_behavior_design)

Fix for mantis 711 (http://mantis.tokeek.de/view.php?id=711)
Changed Files: htroot/js/yacysearch.js, htroot/yacysearch.html
Mon May 22 01:56:11 CEST 2017
by reger
Set request originator to own peer in warc importer
in addition to change in https://github.com/yacy/yacy_search_server/commit/039162fbf0eca808afd350d360c3bcfe62dc4195
Changed Files: source/net/yacy/document/importer/WarcImporter.java
Mon May 22 01:34:08 CEST 2017
by reger
Change warc importer to use defaultsurrogate-crawl profile, as reported
by LA_FORGE http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5990 and
analysed by @luccioman (see comment https://github.com/yacy/yacy_search_server/commit/510f11d3745e14841420781376b733fd248d51f3)
it creates a conflict when using another crawl profile without setting the originator.
Changed Files: source/net/yacy/document/importer/WarcImporter.java
Thu May 18 00:28:00 CEST 2017
by Michael Peter Christen
added a cache to prevent too many seed enumerations
Changed Files: source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java
Wed May 17 09:00:29 CEST 2017
by luccioman
Enable p2p and cluster communication when "Protection of all pages" on

As reported by paul89 on YaCy forum
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5958 ), when setting
the "Protection of all pages" to "On" in the "ConfigAccounts_p.html"
page, the peer became completely unreachable by others, which is not the
purpose of this feature.
But the restriction still makes sense as a security enforcement and is
maintained in private "Robinson mode" where by the way any peer-to-peer
or cluster communication would be rejected.
Changed Files: source/net/yacy/http/Jetty9YaCySecurityHandler.java
Tue May 16 09:44:13 CEST 2017
by luccioman
Added missing accessibility attributes on search results progress bar.
Changed Files: htroot/js/yacysearch.js, htroot/yacysearch.html
Mon May 15 13:31:24 CEST 2017
by luccioman
Annotated search result information separators for screen readers.
Changed Files: htroot/ConfigSearchPage_p.html, htroot/yacysearchitem.html
Sat May 13 20:38:25 CEST 2017
by sgaebel
added closing of lst-Tag in solr-Export
Changed Files: source/net/yacy/search/index/Fulltext.java
Thu May 11 08:33:19 CEST 2017
by luccioman
Added some JavaDoc
Changed Files: source/net/yacy/peers/RemoteSearch.java
Tue May 09 22:52:54 CEST 2017
by reger
Adjust mergeDocuments to keep youngest last-modified date of document
collection
Changed Files: source/net/yacy/document/Document.java, test/java/net/yacy/document/DocumentTest.java
Tue May 09 18:32:47 CEST 2017
by luccioman
Fixed StringIndexOutOfBoundsException case.

Revealed by commit c77e43a : the exception was then thrown when indexing
pages containing mailto: scheme URL links with the Solr Webgraph core
enabled.
Fixed the error case and restored filtering on mailto links in
Document.resortLinks() as these URLs still should not appear in
Document.hyperlinks.
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/search/schema/WebgraphConfiguration.java
Tue May 09 12:20:41 CEST 2017
by luccioman
Updated Debian package post install script admin password encoding.

To fit the now default HTTP authentication method set to Digest in
commit f7fce1b.
Also fixed the unauthenticated access from localhost setting when first
installing the Debian package and leaving the prompted password field
empty.
Changed Files: debian/postinst
Thu May 04 16:36:45 CEST 2017
by luccioman
Improved new blacklist entries URL scheme detection.
Changed Files: source/net/yacy/repository/BlacklistHelper.java, test/java/net/yacy/repository/BlacklistHelperTest.java
Thu May 04 11:21:27 CEST 2017
by luccioman
Updated putHTML() JavaDoc
Changed Files: source/net/yacy/server/serverObjects.java
Thu May 04 11:19:59 CEST 2017
by luccioman
Handle '?' and '+' chars as valid wild cards when adding to blacklist.

An entry such as "domain.com/[a-z]+" is a valid regular expression and
does not need additional ".*.*/.*" wildcards.
Changed Files: source/net/yacy/repository/BlacklistHelper.java
Thu May 04 11:12:58 CEST 2017
by luccioman
Fixed blacklist Regex containing '+' characters rendering.

As reported on YaCy forum by shni
(http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5970) when a
blacklist entry contained both '?' and '+' characters, the '+' chars
were wrongly decoded and rendered as spaces.
Changed Files: htroot/Blacklist_p.java
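The symptom described above matches generic form decoding, where a raw '+' stands for a space. The sketch below uses only java.net.URLDecoder and URLEncoder. It is not the Blacklist_p code, and the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical demo of how '+' is treated by application/x-www-form-urlencoded decoding. */
final class PlusDecodingSketch {

    /** Form-decodes a value; a raw '+' becomes a space. */
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Form-encodes a value; '+' is written as %2B so it survives a later decode. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Decoding a regex such as "[a-z]+" that was never encoded turns the quantifier into a space, whereas decode(encode(value)) returns the original entry.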
Wed May 03 18:53:01 CEST 2017
by luccioman
Added MediaWiki dump import scheduling feature.

Checking the last modified date by default to prevent unnecessary long
running operations.
Changed Files: htroot/IndexImportMediawiki_p.html, htroot/IndexImportMediawiki_p.java, source/net/yacy/data/WorkTables.java
Tue May 02 09:38:45 CEST 2017
by luccioman
Improved MediaWiki dump import monitoring.

When import thread is terminated :
 - now stop refreshing and stay on the monitoring page to give user a
feedback after a long running import
 - added link to the next monitoring step : results from surrogates
reader
 - added link to new import
 
On the new import page, added a link to the last import report, if any.
Changed Files: htroot/IndexImportMediawiki_p.html, htroot/IndexImportMediawiki_p.java
Tue May 02 09:33:11 CEST 2017
by luccioman
Added some JavaDoc
Changed Files: source/net/yacy/document/importer/Importer.java
Tue May 02 09:32:04 CEST 2017
by luccioman
Fixed regression introduced by commit 9ad4d16

On MediaWiki dump imports, the SurrogateReader was trying to unread too
many bytes, then failing with the following exception :
"java.io.IOException: Push back buffer is full".
Changed Files: source/net/yacy/document/content/SurrogateReader.java
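The failure mode can be reproduced with the JDK alone. The following is a minimal sketch with a hypothetical class name; it is unrelated to the actual SurrogateReader code. A java.io.PushbackInputStream rejects an unread that exceeds the capacity given at construction.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

/** Hypothetical demo: unreading more bytes than the push back buffer can hold. */
final class PushbackSketch {

    /** Returns true if unreading `count` bytes into a buffer of `capacity` bytes fails. */
    static boolean unreadFails(int capacity, int count) {
        byte[] data = new byte[16];
        try (PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(data), capacity)) {
            byte[] head = new byte[count];
            int read = in.read(head);   // consume the first bytes, e.g. to sniff a format
            in.unread(head, 0, read);   // push them back; throws if read > capacity
            return false;
        } catch (IOException e) {
            return true;
        }
    }
}
```

The general remedy is to size the push back buffer to at least the largest number of bytes that will ever be unread, or to unread fewer bytes.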
Mon May 01 11:38:02 CEST 2017
by Michael Peter Christen
added patch to rewrite altered yacy grid schema into yacy schema

This generates the stub and protocol parts of an url for inboundlinks,
outboundlinks and images
Changed Files: source/net/yacy/search/Switchboard.java
Sun Apr 30 23:53:52 CEST 2017
by reger
Add a responseHeader to the solr index export with a format identifier
and export parameter (in accordance with response xml format) for easier
format detection on import.
Changed Files: source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/search/index/Fulltext.java
Fri Apr 28 11:39:51 CEST 2017
by luccioman
Fixed Index Export feature for compatibility with old indexed documents.

This is a fix for mantis 682 (http://mantis.tokeek.de/view.php?id=682)
and issue #116
Changed Files: source/net/yacy/search/index/Fulltext.java
Fri Apr 28 11:36:48 CEST 2017
by luccioman
Added some JavaDoc
Changed Files: source/net/yacy/cora/federate/solr/SchemaDeclaration.java
Thu Apr 27 18:24:54 CEST 2017
by luccioman
Crawl results page : apply table lines number limit.

Take into account the already existing default limit value (especially
useful after a long crawl or surrogates import), or a custom one from
parameter "count".
Added a "Show all" link for convenience.
Changed Files: htroot/CrawlResults.html, htroot/CrawlResults.java
Thu Apr 27 09:50:04 CEST 2017
by luccioman
Extended WikiCode template inclusion syntax support.

Wiki templates are not rendered but syntax support is improved, which
greatly enhances snippet rendering on search results coming from a
MediaWiki dump import.
Tested on various dumps from Wikimedia at
https://dumps.wikimedia.org/backup-index.html
See also Wikipedia transclusion documentation at
https://en.wikipedia.org/wiki/Wikipedia:Transclusion
Changed Files: source/net/yacy/data/wiki/WikiCode.java, test/java/net/yacy/data/wiki/WikiCodeTest.java
Tue Apr 25 08:44:02 CEST 2017
by Michael Peter Christen
added yacy grid flatjson surrogate parser
Changed Files: source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionSchema.java
Mon Apr 24 18:24:26 CEST 2017
by luccioman
Fixed surrogates import monitoring page (/CrawlResults.html?process=7)

This page was always empty, as described in mantis 740
(http://mantis.tokeek.de/view.php?id=740)
Changed Files: source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/search/Switchboard.java
Sat Apr 22 23:32:40 CEST 2017
by reger
upd to jwat-1.0.5
Changed Files: .classpath, build.xml, lib/jwat-archive-common-1.0.5.jar, lib/jwat-common-1.0.5.jar, lib/jwat-gzip-1.0.5.jar, lib/jwat-warc-1.0.5.jar, pom.xml
Thu Apr 20 00:47:52 CEST 2017
by reger
fix unit test MultiProtocolURL(file) assertion for Windows path with
drive letter.
Changed Files: test/java/net/yacy/cora/document/id/MultiProtocolURLTest.java
Thu Apr 20 00:18:18 CEST 2017
by reger
Take out mailto collect in internal parsed document
As earlier plans to make use of mailto as separate webgraph entity didn't
materialize (see  http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5726&p=32493&hilit=mailto#p32493)
free the unused handling and resources.
Changed Files: htroot/ViewFile.java, source/net/yacy/document/Document.java
Sun Apr 16 04:25:29 CEST 2017
by reger
Add url input field as source for WarcImporter
allowing to import warc from url without prior download.
Changed Files: htroot/IndexImportWarc_p.html, htroot/IndexImportWarc_p.java, source/net/yacy/document/importer/WarcImporter.java
Fri Apr 14 14:23:50 CEST 2017
by luccioman
Improved http client close time on stream processing errors.
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Wed Apr 12 17:17:03 CEST 2017
by luccioman
Fixed endless loop case in wikicode processing.

Detected when importing recent MediaWiki dumps containing some pages
with script content in plain text format (see Scribunto extension
https://www.mediawiki.org/wiki/Extension:Scribunto ).

Further improvement : modify the MediawikiImporter to prevent processing
revisions whose <model> is not wikitext.
Changed Files: source/net/yacy/data/wiki/WikiCode.java, test/java/net/yacy/data/wiki/WikiCodeTest.java
Wed Apr 12 09:23:10 CEST 2017
by luccioman
Improved support for non ASCII chars in local file system URLs

Creating a MultiProtocolURL instance from a File object and then
retrieving a File with getFSFile() was inconsistent with file paths
containing space or non ASCII chars. 
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/java/net/yacy/cora/document/id/MultiProtocolURLTest.java
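The expected round-trip property can be illustrated with the standard java.io.File and java.net.URI classes. This sketch does not use MultiProtocolURL or getFSFile(), and the class name is hypothetical; it only shows the consistency that a File to URL to File conversion should preserve.

```java
import java.io.File;
import java.net.URI;

/** Hypothetical demo of a File -> URI -> File round trip with spaces and non ASCII chars. */
final class FileUrlRoundTripSketch {

    /** Converts the file to a file: URI and back again. */
    static File roundTrip(File file) {
        URI uri = file.toURI();   // spaces are percent-encoded here
        return new File(uri);     // and decoded again here
    }

    /** Returns the pure ASCII form of the file: URI (non ASCII chars percent-encoded). */
    static String asciiUrl(File file) {
        return file.toURI().toASCIIString();
    }
}
```

A path such as "my docs/r\u00e9sum\u00e9.txt" should come back unchanged from roundTrip, while asciiUrl shows the encoded form sent over the wire.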
Tue Apr 11 08:21:34 CEST 2017
by luccioman
Improved error reports on various wiki dump prerequisites failure cases.

Also added some JavaDoc.
Changed Files: htroot/IndexImportMediawiki_p.html, htroot/IndexImportMediawiki_p.java
Tue Apr 11 07:34:17 CEST 2017
by luccioman
Used a text input for wiki dump import file selection.

Using an HTML "file" input was confusing (as reported by promocore on
YaCy forum : http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5965) ,
and it only worked with MS IE/Edge on a local YaCy peer :
 - for security reasons some current major browsers such as Firefox or
Chrome do not allow sending full file path information when using a file
form input
 - the local file system selection popup doesn't make sense when you
want to import a dump on a remote YaCy server
Changed Files: htroot/IndexImportMediawiki_p.html
Mon Apr 10 22:58:20 CEST 2017
by reger
Adjust ConfigSearchPage_p to activated hosts navigator as plugin
Changed Files: htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java
Mon Apr 10 22:42:06 CEST 2017
by reger
Activate hosts navigator plugin. This includes rwi results in the navigator
count.
This might be tangentially related to http://mantis.tokeek.de/view.php?id=736
as the example includes a local index search, while rwi results are not
counted.
Changed Files: htroot/yacysearchtrailer.html, htroot/yacysearchtrailer.java, htroot/yacysearchtrailer.json, htroot/yacysearchtrailer.xml, source/net/yacy/search/navigator/NavigatorPlugins.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/SearchEvent.java
Sun Apr 09 21:42:05 CEST 2017
by reger
add missing text from ConfigRobotsTxt_p to master.lng
and link to Translation Editor to Translation News page.
Changed Files: htroot/TransNews_p.html, locales/master.lng.xlf
Sun Apr 09 02:09:32 CEST 2017
by reger
add servlet to list user in UserDB and made user editor available in
separate servlet for a quick and easy overview of configured user and
selection for edit.
Changed Files: htroot/ConfigAccountList_p.html, htroot/ConfigAccountList_p.java, htroot/ConfigAccounts_p.html, htroot/ConfigAccounts_p.java, htroot/ConfigUser_p.html, htroot/ConfigUser_p.java
Sat Apr 08 22:54:57 CEST 2017
by reger
fix edit current user form to required post method
introduced with https://github.com/yacy/yacy_search_server/commit/cde237b68763c542da20038e5f62bea341ae1d37
Changed Files: htroot/ConfigAccounts_p.html, htroot/ConfigAccounts_p.java
Fri Apr 07 09:15:05 CEST 2017
by Michael Peter Christen
added flatjson parser (stub, unfinished)
Changed Files: source/net/yacy/search/Switchboard.java
Wed Apr 05 00:08:25 CEST 2017
by reger
Introduce a Keyword search navigator using the index field keywords.
The keywords field string is split into words as navigator entries.

A keyword navigator facet is essential for search appliance usage, where
documents and metadata often use specialized keyword vocabularies to
filter search results. This navigator can be used without a custom index schema.

As we have not yet defined a search query command to filter "keywords",
the filtering is limited to adding the keyword to the search query.
Changed Files: source/net/yacy/search/navigator/NavigatorPlugins.java, source/net/yacy/search/navigator/TokenizedStringNavigator.java
Mon Apr 03 22:53:07 CEST 2017
by reger
add CookieTest_p.html text to master.lng
Changed Files: locales/master.lng.xlf
Mon Apr 03 12:20:16 CEST 2017
by luccioman
Enforced access controls on a few more administration pages.

 - ensure use of HTTP POST method when performing server side effect
operations
 - transaction token required to ensure the request has effectively been
requested by user interaction
Changed Files: htroot/ConfigPortal_p.html, htroot/ConfigPortal_p.java, htroot/Table_API_p.html, htroot/Table_API_p.java, htroot/Translator_p.html, htroot/Translator_p.java
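The two checks listed above can be sketched generically as follows. This is not YaCy's transaction token implementation: the class, the method names and the token format are assumptions chosen for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical sketch: accept a state-changing request only via POST with a valid token. */
final class TransactionTokenSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Creates an unpredictable token to embed in the HTML form as a hidden field. */
    static String newToken() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Returns true only for POST requests carrying the token issued for this form. */
    static boolean accept(String httpMethod, String issuedToken, String submittedToken) {
        if (!"POST".equalsIgnoreCase(httpMethod)) {
            return false; // GET links and prefetchers must not trigger side effects
        }
        if (issuedToken == null || submittedToken == null) {
            return false;
        }
        // Constant-time comparison of the two token values.
        return MessageDigest.isEqual(
                issuedToken.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

A request forged from another site cannot know the issued token, so it fails the second check even when it is sent as POST.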
Mon Apr 03 11:40:37 CEST 2017
by luccioman
Escaped potentially active HTML content from recorded API call comments.
Changed Files: htroot/Table_API_p.java
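The kind of escaping meant here can be sketched with a small standalone helper. The class below is hypothetical and is not the code used in Table_API_p.java; it simply replaces the characters that could make a stored comment render as markup.

```java
/** Hypothetical helper: neutralizes HTML special characters before rendering stored text. */
final class HtmlEscapeSketch {

    /** Replaces &, <, >, double and single quotes with their HTML character references. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

A comment containing a script element is then displayed as text instead of being executed by the browser.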
Sun Apr 02 22:30:23 CEST 2017
by reger
update master.lng with recent text changes 
to IndexExport_p.html, IndexImportWarc_p.html
Changed Files: locales/master.lng.xlf
Sun Apr 02 20:36:22 CEST 2017
by reger
use css error class for error msg in IndexImportOAIPMH_p.html,
adjust to xhtml <p> usage rule
Changed Files: htroot/IndexImportOAIPMH_p.html
Sun Apr 02 03:59:37 CEST 2017
by reger
remove test case for Standard_MemoryControl which will always fail
see https://github.com/yacy/yacy_search_server/pull/114
Changed Files:
Sun Apr 02 03:32:21 CEST 2017
by reger
Add servlet to import warc file from filesystem IndexImportWarc_p.html.
Apply Importer interface to WarcImporter
Changed Files: htroot/IndexImportWarc_p.html, htroot/IndexImportWarc_p.java, htroot/env/templates/submenuIndexImport.template, source/net/yacy/document/importer/WarcImporter.java, source/net/yacy/search/Switchboard.java
Sat Apr 01 01:04:17 CEST 2017
by Michael Peter Christen
added export to elasticsearch. The export dump can easily be imported to
elasticsearch using the command
curl -XPOST localhost:9200/collection1/yacy/_bulk --data-binary @yacy_dump_XXX.flatjson
Changed Files: htroot/IndexExport_p.html, htroot/IndexExport_p.java, source/net/yacy/cora/federate/solr/responsewriter/FlatJSONResponseWriter.java, source/net/yacy/search/index/Fulltext.java
Thu Mar 30 16:14:22 CEST 2017
by luccioman
URL Viewer : only display the link to metadata when metadata exists
Changed Files: htroot/ViewFile.html, htroot/ViewFile.java
Thu Mar 30 10:23:47 CEST 2017
by luccioman
Modified RWI settings page radio click event to use HTTP POST
Changed Files: htroot/IndexControlRWIs_p.html, locales/de.lng, locales/master.lng.xlf, locales/ru.lng, locales/uk.lng
Thu Mar 30 09:22:28 CEST 2017
by luccioman
Updated API calls recording/replay with recent changes.

 - enabled HTTP POST calls with Digest HTTP authentication
 - made API calls compatible with API newly restricted to HTTP POST only
with transaction token validation
 - ensured backward compatibility with older entries recorded as HTTP
GET
Changed Files: htroot/CrawlStartScanner_p.java, source/net/yacy/data/WorkTables.java
Sun Mar 26 23:52:31 CEST 2017
by reger
fix default/httpd.mime Z file extension to lower case
+ test case
Changed Files: defaults/httpd.mime, test/java/net/yacy/cora/document/analysis/ClassificationTest.java
Sun Mar 26 23:26:40 CEST 2017
by reger
remove seedlist bootstrap target (not working for quite some time)
Changed Files: defaults/yacy.network.freeworld.unit
Sun Mar 26 23:13:12 CEST 2017
by reger
Add label text for search word statistic (AccessTracker_p.html) to master
lng file
Changed Files: locales/master.lng.xlf
Sun Mar 26 20:05:48 CEST 2017
by reger
One more use of SwitchboardConstants.SERVER_PORT constant,
apply standard servlet design pattern initialization of solrselectservlet 
Changed Files: source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Sun Mar 26 11:29:04 CEST 2017
by luccioman
Extended Apache HTTP Digest Auth. for use of YaCy encoded password

When programmatically requesting the local peer with Apache http client,
authentication credentials must be passed as clear-text values. 
This extension to the apache org.apache.http.impl.auth.DigestScheme
permits use of the YaCy encoded password stored in the
adminAccountBase64MD5 configuration property.
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/cora/protocol/http/auth/HttpEntityDigester.java, source/net/yacy/cora/protocol/http/auth/YaCyDigestScheme.java, source/net/yacy/cora/protocol/http/auth/YaCyDigestSchemeFactory.java
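Why a pre-hashed password can be enough for Digest authentication is shown by the RFC 2617 computation itself (variant without qop). The sketch below assumes that the stored encoded value corresponds to the HA1 hash MD5(user:realm:password); the exact format of adminAccountBase64MD5 is not shown here, and the class is hypothetical, not the YaCyDigestScheme code.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of the RFC 2617 Digest response computation (no qop). */
final class DigestSketch {

    /** Lower-case hexadecimal MD5 of the given string. */
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.ISO_8859_1));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** HA1 = MD5(user:realm:password); the only step that needs the clear-text password. */
    static String ha1(String user, String realm, String password) {
        return md5Hex(user + ":" + realm + ":" + password);
    }

    /** response = MD5(HA1:nonce:MD5(method:uri)); works from a stored HA1 alone. */
    static String responseFromHa1(String ha1, String nonce, String method, String uri) {
        String ha2 = md5Hex(method + ":" + uri);
        return md5Hex(ha1 + ":" + nonce + ":" + ha2);
    }
}
```

Since the clear-text password only enters through HA1, a client holding that hash can answer a Digest challenge without knowing the password.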
Sun Mar 26 10:59:04 CEST 2017
by luccioman
Updated dump/restore shell scripts : the API is now IndexExport_p.html
Changed Files: bin/indexdump.sh, bin/indexrestore.sh
Tue Mar 21 01:16:16 CET 2017
by reger
Update master lng file with added text in Settings_ServerAccess
remove outdated file entry in fr.lng & sk.lng
Changed Files: README.md, locales/fr.lng, locales/master.lng.xlf, locales/sk.lng
Mon Mar 20 02:33:21 CET 2017
by reger
Add hint how to build with maven (for the first time) to readme
Changed Files: README.md
Sun Mar 19 21:45:33 CET 2017
by reger
Add hint text to default ServerAccess Port Settings page
Changed Files: htroot/Settings_ServerAccess.inc
Sun Mar 19 07:12:35 CET 2017
by reger
Display the local search word statistic in alphabetic order
Changed Files: htroot/AccessTracker_p.java, source/net/yacy/cora/sorting/OrderedScoreMap.java
Sat Mar 18 20:32:53 CET 2017
by reger
upd to slf4j-1.7.24.jar
Changed Files: .classpath, build.xml, lib/jcl-over-slf4j-1.7.24.jar, lib/log4j-over-slf4j-1.7.24.jar, lib/slf4j-api-1.7.24.jar, lib/slf4j-jdk14-1.7.24.jar, pom.xml
Sat Mar 18 20:06:58 CET 2017
by reger
upd to icu4j-58_2.jar
Changed Files: .classpath, build.xml, lib/icu4j-58_2.jar, pom.xml
Fri Mar 17 02:19:33 CET 2017
by reger
update to jsoup-1.10.2.jar
Changed Files: .classpath, build.xml, lib/jsoup-1.10.2.jar, pom.xml
Fri Mar 17 02:07:02 CET 2017
by reger
update to jsch-0.1.54.jar
Changed Files: .classpath, build.xml, lib/jsch-0.1.54.License, lib/jsch-0.1.54.jar, pom.xml
Wed Mar 15 22:36:53 CET 2017
by reger
update translation for ConfigNetwork_p.html
Changed Files: htroot/ConfigNetwork_p.html, locales/de.lng, locales/master.lng.xlf
Wed Mar 15 01:39:15 CET 2017
by reger
make digest default authentication in defaults/web.xml
Changed Files: defaults/web.xml
Mon Mar 13 03:08:44 CET 2017
by reger
remove double occurrence of geo:lat in rss tokens
Changed Files: source/net/yacy/cora/document/feed/RSSMessage.java
Mon Mar 13 00:34:40 CET 2017
by reger
upd to metadata-extractor-2.10.1.jar
Changed Files: .classpath, build.xml, lib/metadata-extractor-2.10.1.License, lib/metadata-extractor-2.10.1.jar, pom.xml
Sun Mar 12 01:54:56 CET 2017
by reger
implement RequestHeader getRequestURI, getRequestURL for legacy request
Changed Files: source/net/yacy/cora/protocol/RequestHeader.java
Thu Mar 09 22:57:51 CET 2017
by reger
remove unused import pdfParser
Changed Files: source/net/yacy/document/parser/pdfParser.java
Thu Mar 09 22:56:33 CET 2017
by reger
Improve pdf text extraction resource handling.
For short pdf (<= 3 pages) use the already extracted content,
only for long pdf (> 3 pages) reassign content and close the internal writer (to directly free buffers)
Changed Files: source/net/yacy/document/parser/pdfParser.java
Thu Mar 09 22:50:19 CET 2017
by reger
upd to pdfbox-2.0.4.jar
Changed Files: .classpath, build.xml, lib/fontbox-2.0.4.License, lib/fontbox-2.0.4.jar, lib/pdfbox-2.0.4.License, lib/pdfbox-2.0.4.jar, pom.xml
Thu Mar 09 01:42:36 CET 2017
by reger
eliminate some compiler unchecked and deprecation warnings
in nav plugins by explicit type declaration and replacing date.getYear
with Calendar.get
Changed Files: source/net/yacy/search/navigator/NavigatorPlugins.java, source/net/yacy/search/navigator/YearNavigator.java
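The replacement of the deprecated Date.getYear() (which returns the year minus 1900) can be sketched as below. The class and method names are hypothetical and this is not the YearNavigator code; passing the time zone explicitly is a choice made for this example.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

/** Hypothetical helper: reads the calendar year of a Date without Date.getYear(). */
final class YearSketch {

    /** Returns the full four-digit year of the date in the given time zone. */
    static int yearOf(Date date, TimeZone zone) {
        Calendar calendar = new GregorianCalendar(zone);
        calendar.setTime(date);
        return calendar.get(Calendar.YEAR); // full year, no 1900 offset to add
    }
}
```

Unlike Date.getYear(), the Calendar based call compiles without a deprecation warning and needs no manual offset.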
Wed Mar 08 22:35:48 CET 2017
by reger
upd to httpclient v4.5.3
Changed Files: .classpath, build.xml, lib/httpclient-4.5.3.jar, lib/httpcore-4.4.6.License, lib/httpcore-4.4.6.jar, lib/httpmime-4.5.3.jar, pom.xml
Wed Mar 08 10:27:18 CET 2017
by luccioman
Fixed unresolved pattern case in search results progress bar.

This is a fix for mantis 715 (http://mantis.tokeek.de/view.php?id=715).

A possible scenario that could lead to this case :
 - YaCy is running low in memory
 - a search is requested
 - before the end of search results rendering, the cleanup job runs and
deletes the running search event from the cache because of short memory
 - then yacysearchitem renders with "-UNRESOLVED_PATTERN-" parameter
values passed to the statistics() JavaScript function
Changed Files: htroot/yacysearchitem.html, htroot/yacysearchitem.java
Sun Mar 05 02:26:10 CET 2017
by reger
Extend DCEntry.getLanguage convert to ISO639-1 codes for more languages
by using icu.ULocale for languages not already covered (ICU normalizes 
to ISO639-1 2 char codes).
Add test class
Use DublinCore vocabulary declarations in DCEntry and SurrogateReader 
for easier usage debugging, 
Init SurrogateReader.inputSource on first use.

Changed Files: source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/content/SurrogateReader.java, test/java/net/yacy/document/content/DCEntryTest.java
Sat Mar 04 22:45:17 CET 2017
by reger
further avoid to set connect info properties as header value
following comment "use of properties as header values is discouraged"
in case where (proxy)HTTPClient overwrites values with supplied url.
Use defined request.referer procedure in response class.
Changed Files: source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/http/servlets/YaCyProxyServlet.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Sat Mar 04 19:41:31 CET 2017
by reger
use pre-defined "Connection" header key, replace deprecated
Changed Files: source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/protocol/http/HTTPClient.java
Fri Mar 03 12:05:30 CET 2017
by luccioman
Added an advanced settings page for referrer policy settings.

Feedback will be welcome, notably on the descriptive content of this
page.
Changed Files: htroot/SettingsAck_p.html, htroot/SettingsAck_p.java, htroot/Settings_Referrer.inc, htroot/Settings_p.html, htroot/Settings_p.java, source/net/yacy/http/ReferrerPolicy.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/search/SwitchboardConstants.java
Fri Mar 03 00:21:56 CET 2017
by reger
fix proxyservlet response url to respect http scheme if a relative 
Location header is returned.
Changed Files: source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/http/servlets/YaCyProxyServlet.java
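Resolving a relative Location header against the request URL, so that the original scheme is preserved, can be sketched with java.net.URI. The class below is hypothetical and is not the proxy servlet code, and the example.org URLs are placeholders.

```java
import java.net.URI;

/** Hypothetical helper: turns a possibly relative Location header into an absolute URL. */
final class LocationSketch {

    /** Resolves the Location value against the URL that was requested. */
    static String resolveLocation(String requestUrl, String locationHeader) {
        // A relative value inherits scheme and host; an absolute one is returned as is.
        return URI.create(requestUrl).resolve(locationHeader).toString();
    }
}
```

For a request to https://example.org/a/b.html, a Location of "/login" resolves to https://example.org/login, so the https scheme of the original request is kept.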
Wed Mar 01 09:43:00 CET 2017
by luccioman
Updated Archive-It heuristics URL.

The archive-it OpenSearch URL requested without restriction on
collections ("i" parameter) almost always ends up with timeout or fails.
Changed Files: defaults/heuristicopensearch.conf
Mon Feb 27 23:00:46 CET 2017
by reger
fixed ReindexSolrBusyThread new and unexpected repeat of same query with
low number of found documents - by adding additional end condition to 
remove processed query with number of found docs <= process-chunk-size.

Noticed on query h4_txt:[* TO *], found 21, process 21, call of commit happened
but on next cycle same query again 21 docs found (while h4_txt was removed 
from schema and committed inputdocuments).
Changed Files: source/net/yacy/search/index/ReindexSolrBusyThread.java
Mon Feb 27 01:04:31 CET 2017
by reger
fix delta time calculation in PerformanceSearch_p for the first entry
(INITIALIZATION displayed absolute date, set delta to 0 for the first entry)
Changed Files: htroot/PerformanceSearch_p.java
Sun Feb 26 11:03:15 CET 2017
by luccioman
Fixed datacite.org heuristics base url.

The datacite Solr search http URL was returning http status 301 in order
to redirect to its https version, thus making that YaCy heuristic always
fail.
Changed Files: defaults/federatecfg/datacite.solr.schema
Sun Feb 26 02:39:52 CET 2017
by reger
Adjust DefaultServlet test case to recent change,
deprecate unused CONNECTION_PROP_PROTOCOL (also as it might be
misleading with getProtocol vs getScheme)
Changed Files: source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/cora/protocol/RequestHeader.java, test/java/net/yacy/http/servlets/YaCyDefaultServletTest.java
Sat Feb 25 23:55:17 CET 2017
by reger
Fix call parameter for ConnectionInfo in MonitorHandler
(expected scheme e.g. http, was protocol version).
Deprecate obsolete custom X-...-Scheme header constant.
Use existing FORMAT_ANSIC Dateformatter in HeaderFramework.
Correct htmlParserTest (del one not intended println)
Changed Files: source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/http/MonitorHandler.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, test/java/net/yacy/document/parser/htmlParserTest.java
Fri Feb 24 11:09:42 CET 2017
by luccioman
Added a hint title for required fields in the Solr Schema editor
Changed Files: htroot/IndexSchema_p.html
Fri Feb 24 11:08:18 CET 2017
by luccioman
Switched a few more Solr fields from strictly mandatory to optional
Changed Files: defaults/solr.collection.schema, source/net/yacy/search/schema/CollectionSchema.java
Fri Feb 24 01:25:32 CET 2017
by reger
fix htmlParser <script> text extraction on code containing expression
recognized as tag like 1<a
reported in https://github.com/yacy/yacy_search_server/issues/109

Script content is ignored by default, but the text is filtered for html
tags. Modified scraper to skip tag filtering while within a <script> 
section (until a closing </script> tag is detected).
Possible side effect, missing </script> end-tag will truncate trailing 
content text.
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/TransformerWriter.java, test/java/net/yacy/document/parser/htmlParserTest.java
Thu Feb 23 11:09:43 CET 2017
by luccioman
Improved MultiProtocolURL non ASCII characters support.

After @sinkuu Pull Request #108 added JUnit tests, updated some JavaDoc
and also improved URL tokenization to support non ASCII characters.
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java, test/java/net/yacy/cora/document/id/MultiProtocolURLTest.java
Thu Feb 23 07:52:55 CET 2017
by luccioman
Merge pull request #110 from goofy-bz/patch-1

Fixing some typos
Changed Files: locales/fr.lng
Thu Feb 23 01:13:31 CET 2017
by goofy-bz
Fixing some typos

up to line #1000 only
Changed Files: locales/fr.lng
Thu Feb 23 00:27:56 CET 2017
by reger
Correct dublincore title property text to lowercase in htmlresponsewriter,
remove unused (carry over) local variable
Do the same for other responsewriter.
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java
Wed Feb 22 02:01:48 CET 2017
by Burkhard
Update SearchEvent.java

Fix NPE on disabled local SolrIndex, occurring on search when moving to the 2nd result page.
The debug-only setting for disabling the local SolrIndex (System Admin -> Debug Settings) should probably be removed from production code in the long term.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Feb 21 22:59:11 CET 2017
by luccioman
Switched some Solr fields from mandatory to optional

These fields are enabled by default but are undoubtedly not strictly
mandatory with the current code base.

As reported by @reger24, splitting between essential mandatory and
optional fields is still to be improved to reflect the current YaCy
needs.
Changed Files: defaults/solr.collection.schema, source/net/yacy/search/schema/CollectionSchema.java
Mon Feb 20 23:27:33 CET 2017
by reger
Add extract of queries.log in form of top search word cloud (last 7 days)
to AccessTracker_p.html (Network Access -> Local Search Log page).
It displays top 20 words of search queries.
Changed Files: htroot/AccessTracker_p.html, htroot/AccessTracker_p.java
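A minimal sketch of how the top words of a list of queries could be computed. This is not the AccessTracker_p code: the class TopQueryWords is hypothetical, and splitting on whitespace and breaking ties alphabetically are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration of a "top search words" computation. */
final class TopQueryWords {

    /** Returns at most limit words, most frequent first (ties sorted alphabetically). */
    static List<String> top(List<String> queries, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String query : queries) {
            for (String word : query.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (result.size() >= limit) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

Calling TopQueryWords.top(queries, 20) would give the word list for a cloud of 20 words; the entry counts could then drive the font size of each word.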
Mon Feb 20 00:14:14 CET 2017
by reger
correct fromDate init value on missing param in api/timeline_p servlet
revert test modification from last commit in AccessTracker.main
Changed Files: htroot/api/timeline_p.java, source/net/yacy/search/query/AccessTracker.java
Sun Feb 19 05:23:17 CET 2017
by reger
add hint of query syntax in AccessTracker log (qs=normal querystring,
sq=solr-querystring) to allow to filter simple text queries for processing,
remove toString for counter parameter
use more predefined constants in solrservlet
Changed Files: source/net/yacy/http/servlets/GSAsearchServlet.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/query/AccessTracker.java
Fri Feb 17 11:09:30 CET 2017
by luccioman
Fixed a NullPointerException case possible on Index Export

As reported by Palulukas in YaCy forum
(http://forum.yacy-websuche.de/viewtopic.php?f=18&t=5944&sid=dcef5b899ab4aa9b40e3a3d158c13aed#p33454)
the Index Export operation can fail, notably when the Solr index
contains one or more documents with empty (despite required)
"load_date_dt" field.

This fixes the export failure when the situation finally occurs, but
more should be done to harden verifications on minimum required fields.
Changed Files: source/net/yacy/search/index/Fulltext.java
Thu Feb 16 01:43:14 CET 2017
by reger
Reduce self generated content for text_t (visible text index field) 
to avoid repeat of tokenized url as description,
continuation of https://github.com/yacy/yacy_search_server/commit/7e09bff4a1a117d2f2336e004ec67ffb325a7e9d
https://github.com/yacy/yacy_search_server/commit/1409cabe8b7bce1fb767f01665d9d7e0a91a81b6
Add some Javadoc, and drop the unneeded removal of omitted fields in postprocessing.
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Wed Feb 15 23:26:54 CET 2017
by reger
removed faroo news from default opensearch config
As @luccioman informed, it's only useable with a free api key
http://www.faroo.com/hp/api/api.html
http://blog.faroo.com/2013/06/30/faroo-introduces-an-api-key/
Changed Files: defaults/heuristicopensearch.conf
Wed Feb 15 15:04:40 CET 2017
by luccioman
Added robots.txt support for heuristics federated search.

As noticed by @reger24, abusive use of OpenSearch systems should be
prevented, especially if allowing to parse and reuse HTML results.
The robots.txt file is now checked before requesting an external OpenSearch
system, to respect the host exclusions and any crawl-delay value.
The check is also performed when trying to add a new OpenSearch URL
template through the /ConfigHeuristics_p.html admin page.
Changed Files: htroot/ConfigHeuristics_p.java, source/net/yacy/cora/federate/FederateSearchManager.java
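The kind of check described above can be illustrated with a simplified robots.txt reader. This sketch is not the YaCy robots.txt implementation: the class SimpleRobotsRules is hypothetical, it only looks at the "User-agent: *" group, and it treats Disallow values as plain path prefixes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified robots.txt rules for the wildcard user agent only. */
final class SimpleRobotsRules {
    final List<String> disallowed = new ArrayList<>();
    long crawlDelayMillis = 0;

    static SimpleRobotsRules parse(String robotsTxt) {
        SimpleRobotsRules rules = new SimpleRobotsRules();
        boolean inWildcardGroup = false;
        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            String line = rawLine;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip comments
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("user-agent")) {
                inWildcardGroup = value.equals("*");
            } else if (inWildcardGroup && key.equals("disallow") && !value.isEmpty()) {
                rules.disallowed.add(value);
            } else if (inWildcardGroup && key.equals("crawl-delay")) {
                try {
                    rules.crawlDelayMillis = (long) (Double.parseDouble(value) * 1000);
                } catch (NumberFormatException e) {
                    // malformed delay: keep the default
                }
            }
        }
        return rules;
    }

    /** True when no Disallow prefix of the wildcard group matches the path. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would fetch /robots.txt from the OpenSearch host, skip the request when isAllowed returns false for the template path, and otherwise wait crawlDelayMillis between two requests to the same host.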
Sat Feb 11 08:10:14 CET 2017
by sinkuu
Use java.net.URLDecoder
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Tue Feb 14 02:30:26 CET 2017
by reger
adjust translation to renamed configparser_p.html
Changed Files: locales/cn.lng, locales/de.lng, locales/hi.lng, locales/ja.lng, locales/master.lng.xlf, locales/ru.lng, locales/uk.lng
Tue Feb 14 02:04:42 CET 2017
by reger
make ConfigParser a protected page, for consistent behavior of locked
menu items.
Changed Files: htroot/ConfigParser_p.html, htroot/ConfigParser_p.java, htroot/env/templates/submenuCrawler.template
Tue Feb 14 00:31:32 CET 2017
by reger
update opensearch conf - remove suche.sueddeutsche.de
apparently they've revoked the participation in opensearch initiative.
Changed Files: defaults/heuristicopensearch.conf
Fri Feb 10 09:40:42 CET 2017
by luccioman
Upgraded Apache Ant to 1.10.1 in the Docker alpine flavor image

For a more reliable Docker image build, also switched to the ant archive
repository to fetch the needed binary as other repositories only provide
the latest versions.
Changed Files: docker/Dockerfile.alpine
Thu Feb 09 16:42:21 CET 2017
by luccioman
Replaced absolute redirection locations by relative ones when possible.

This makes integration of YaCy behind a reverse proxy subfolder easier.
Changed Files: htroot/Blacklist_p.java, htroot/Status.java, htroot/Wiki.java, source/net/yacy/repository/BlacklistHelper.java
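Why relative locations help behind a reverse proxy subfolder can be shown with the standard java.net.URI resolution, which follows the same rules a browser applies to a Location header. The class name and the example.org URLs below are made up for the illustration.

```java
import java.net.URI;

/** Hypothetical illustration of how a client resolves a redirect location. */
final class RedirectResolution {

    /** Resolves the location against the URL of the request that was redirected. */
    static String resolve(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }
}
```

For a peer published under https://example.org/yacy/, a redirect from Blacklist_p.html to the relative location Status.html stays in the subfolder, whereas the absolute path /Status.html resolves to https://example.org/Status.html and leaves it.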
Mon Feb 06 12:41:24 CET 2017
by luccioman
Improved termination of timed out remote solr requests to peers.

On timeout, closing remote Solr requests is more appropriate than simply using
Thread.interrupt(), which is not effective in most cases. Closing does not
ask for a commit on the remote Solr, but releases http connection resources and
is more likely to end those threads that could otherwise wait indefinitely.

Other related improvements included :
 - no more marking remote peer as not available when remote search is
interrupted before timeout by the cleanup job.
 - added a short fine log level trace of failing remote solr requests
Changed Files: source/net/yacy/peers/Protocol.java
Fri Feb 03 10:32:31 CET 2017
by luccioman
Removed deprecated "localMissCount" prop from yacysearchlatestinfo.json.

This property has been deprecated four years ago by commit
d74472f5625ff097e7541e1a56156cbe487b2651. For any active search event
id, it was then always filled with "-UNRESOLVED_PATTERN-".
Changed Files: htroot/yacysearchlatestinfo.java, htroot/yacysearchlatestinfo.json
Fri Feb 03 09:55:08 CET 2017
by luccioman
Named a Thread without name for easier monitoring
Changed Files: source/net/yacy/search/query/SearchEvent.java
Fri Feb 03 09:54:29 CET 2017
by luccioman
Distinguished solr connectors thread names for easier monitoring.
Changed Files: source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java
Wed Feb 01 18:44:42 CET 2017
by luccioman
Refactored the DHT-Trigger section in Performance_p.html page.

This makes the section easier to understand and reflects more accurately
the current memory strategy implementations, which eventually set the
"proper" state not only because of DHT reception.
Changed Files: htroot/Performance_p.html, locales/cn.lng, locales/de.lng, locales/fr.lng, locales/master.lng.xlf, locales/ru.lng, locales/uk.lng
Tue Jan 31 16:33:17 CET 2017
by luccioman
Updated French translation for the /Performance_p.html page.

Also updated the master xliff file with missing recent changes.
Changed Files: locales/fr.lng, locales/master.lng.xlf
Tue Jan 31 09:20:19 CET 2017
by luccioman
Fixed unresolved pattern on directory entries in HostBrowser.xml api.

As described in mantis 725 (http://mantis.tokeek.de/view.php?id=725) the
HostBrowser.xml api directory entries had incorrect count attribute
value. 
This was because the HostBrowser html page and backing template servlet
evolved, but the modifications were not carried over to the xml api.
Changed Files: htroot/HostBrowser.xml
Mon Jan 30 22:44:28 CET 2017
by reger
adjust column layout in Settings_Proxy.inc
Changed Files: htroot/Settings_Proxy.inc
Sat Jan 28 10:19:39 CET 2017
by luccioman
Added a CSS class for infobox block.

This will prevent mistakenly hiding a div element not designed to be an
infobox but having a ".info" parent (After having previously added the
possibility for a div - and not only a span element - to be an infobox).
Changed Files: htroot/Performance_p.html, htroot/env/base.css
Sat Jan 28 01:13:57 CET 2017
by reger
Update language file de & master, remove obsolete "Augmented Browsing"
Changed Files: locales/de.lng, locales/master.lng.xlf
Sat Jan 28 00:36:03 CET 2017
by reger
Add consistency check for related index fields upon load and save of
index schema.
To assemble the original link url for out-/inboundlinks, icons and pictures,
both *_protocol_sxt and *_urlstub_sxt are needed (due to the data-reduced
storage method used). Auto-enable *_protocol_sxt if *_urlstub_sxt is enabled,
to be able to correctly assemble the original link url.
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
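The auto-enable rule can be sketched over a plain set of enabled field names. This is not the CollectionConfiguration code: the class SchemaConsistency is hypothetical and only uses the two field name suffixes quoted in the commit message.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of the urlstub/protocol consistency rule. */
final class SchemaConsistency {
    private static final String STUB_SUFFIX = "_urlstub_sxt";
    private static final String PROTOCOL_SUFFIX = "_protocol_sxt";

    /** Returns the enabled fields plus the protocol field of every enabled urlstub field. */
    static Set<String> withRequiredProtocolFields(Set<String> enabledFields) {
        Set<String> result = new TreeSet<>(enabledFields);
        for (String field : enabledFields) {
            if (field.endsWith(STUB_SUFFIX)) {
                String prefix = field.substring(0, field.length() - STUB_SUFFIX.length());
                result.add(prefix + PROTOCOL_SUFFIX);
            }
        }
        return result;
    }
}
```

Running the same method on load and on save keeps both code paths consistent, since the result is unchanged when the protocol fields are already enabled.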
Thu Jan 26 23:49:15 CET 2017
by reger
adjust the Field-Reindex Thread to verify and update the document id
in case hash (ID) doesn't match document url (sku field).
Changed Files: source/net/yacy/search/index/ReindexSolrBusyThread.java
Thu Jan 26 06:37:29 CET 2017
by Michael Christen
Merge pull request #98 from Velociraptor85/patch-2

LSB Tag
Changed Files: addon/yacyInit.sh
Thu Jan 26 06:29:42 CET 2017
by Michael Christen
Merge pull request #105 from ivar/patch-1

Update README.md - removes deprecated URL
Changed Files: README.md
Thu Jan 26 05:36:48 CET 2017
by Ivar Vasara
Update README.md - removes deprecated URL
Changed Files: README.md
Thu Jan 26 01:13:32 CET 2017
by luccioman
Improved Index Browser accessibility with semantically richer html tags.

Made use of ol, li, thead, th, tbody, h1 and h2 html tags.
Added aria-label attributes to provide alternative textual information
previously only conveyed by color cue.

Tested behavior with NVDA 2016.4 screen reader.
Changed Files: htroot/HostBrowser.html
Wed Jan 25 09:54:39 CET 2017
by luccioman
Fixed local image search pagination regression.

As reported by @tglman on issue #90, when searching images on the local
index only, pages after the first were always empty. This was a
regression from commit c25e48e969f180dcc3c73863acbfcc383a182c8f.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Jan 24 17:14:14 CET 2017
by luccioman
Updated master xliff file with missing entries for HostBrowser.html.

Also translated lang="en" html attribute to lang="[targetLang]" on
locale files having translated entries for HostBrowser.html
Changed Files: locales/de.lng, locales/fr.lng, locales/master.lng.xlf, locales/ru.lng
Tue Jan 24 15:56:29 CET 2017
by Michael Peter Christen
added dc.date.modified and dc.date.created to date parser
Changed Files: source/net/yacy/document/parser/html/ContentScraper.java
Tue Jan 24 11:38:56 CET 2017
by luccioman
Updated French translation of HostBrowser.html
Changed Files: locales/fr.lng
Tue Jan 24 09:40:43 CET 2017
by luccioman
Fixed Index Browser page HTML validation errors and switched to HTML5.

Also removed deprecated HTML attributes uses.

Validation performed with Nu Html Checker 17.1.0.

Cross browser tested with :
 - Debian Jessie : Firefox ESR 45.6.0
 - MS Windows 10 : Firefox 50.1.0, Chrome 55.0.2883.87, MS Edge
Changed Files: htroot/HostBrowser.html, htroot/HostBrowser.java, htroot/HostBrowserAdmin_p.html
Tue Jan 24 01:51:28 CET 2017
by reger
assure that RWI Index.Segment IODispatcher is not blocking on shutdown
waiting on a semaphore permit.
see desc. http://mantis.tokeek.de/view.php?id=723
Changed Files: source/net/yacy/kelondro/rwi/IODispatcher.java
Mon Jan 23 16:05:51 CET 2017
by luccioman
Documented /HostBrowser.html related configuration settings
Changed Files: defaults/yacy.init, htroot/HostBrowser.java
Mon Jan 23 14:49:02 CET 2017
by luccioman
Display Index Browser links requiring auth only when authenticated.

In the /HostBrowser.html page "only hosts with urls pending in the
crawler", "only with load errors" and "Administration Options" all
require administration credentials. But they were displayed even to
unauthenticated users, and clicking them did nothing and returned the
/HostBrowser.html page empty.
Changed Files: htroot/HostBrowser.html, htroot/HostBrowser.java
Sun Jan 22 12:31:14 CET 2017
by luccioman
Fixed display of crawler pending URLs counts in HostBrowser.html page.

As described in mantis 722 (http://mantis.tokeek.de/view.php?id=722)

Also updated some Javadoc.
Changed Files: htroot/HostBrowser.java, source/net/yacy/crawler/Balancer.java, source/net/yacy/crawler/HostBalancer.java, source/net/yacy/crawler/data/NoticedURL.java
Sun Jan 22 12:19:43 CET 2017
by luccioman
Removed temporary test main method committed by mistake.
Changed Files: htroot/yacysearch.java
Sun Jan 22 00:01:18 CET 2017
by reger
add ukr and pol to DCEntry.getLanguage ISO639-2 3-char language code 
conversion to deliver uk, pl 2-char code
and use if else to return on match
Changed Files: source/net/yacy/document/content/DCEntry.java
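A minimal sketch of such a 3-char to 2-char conversion with an if/else chain that returns on the first match. This is not the DCEntry.getLanguage code: the class LanguageCodes is hypothetical, only ukr and pol come from the commit message, the other codes are added for illustration, and returning unknown codes unchanged is an assumption.

```java
import java.util.Locale;

/** Hypothetical illustration of an ISO 639-2 to ISO 639-1 conversion. */
final class LanguageCodes {

    /** Converts a known 3-char language code to its 2-char code, else returns the input lowercased. */
    static String toTwoLetterCode(String code) {
        if (code == null) {
            return null;
        }
        String c = code.trim().toLowerCase(Locale.ROOT);
        if (c.equals("ukr")) {
            return "uk";
        } else if (c.equals("pol")) {
            return "pl";
        } else if (c.equals("deu") || c.equals("ger")) {
            return "de";
        } else if (c.equals("fra") || c.equals("fre")) {
            return "fr";
        } else if (c.equals("eng")) {
            return "en";
        }
        return c; // unknown or already a 2-char code (assumption of this sketch)
    }
}
```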
Sat Jan 21 01:53:43 CET 2017
by reger
delete outdated and unmaintained Netbeans project
Netbeans has good built-in maven support, which is a supported and
maintained build env, making special and additional NB settings obsolete.
Changed Files:
Fri Jan 20 02:15:11 CET 2017
by reger
upd to commons-compress-1.13.jar
hide external icon on forge logo (was also out of position in IE)
Changed Files: .classpath, build.xml, htroot/Status.html, lib/commons-compress-1.13.License, lib/commons-compress-1.13.jar, pom.xml
Thu Jan 19 12:30:44 CET 2017
by luccioman
Added an optional parameter to webstructure.xml api.

This new "documentStructure" parameter can be set to false to only get
hosts accumulated references on a resource and thus prevent scraping the
specified URL and getting citations references.

Also set WebStructureGraph constants as final and updated the Javadoc
with example api call URLs.  
Changed Files: htroot/api/webstructure.java, source/net/yacy/peers/graphics/WebStructureGraph.java
Tue Jan 17 23:45:56 CET 2017
by reger
remove obsolete lastmodified calculation in WebgraphConfig
Changed Files: source/net/yacy/search/schema/WebgraphConfiguration.java
Tue Jan 17 17:01:56 CET 2017
by luccioman
Updated Javadoc and Junit tests for the WebStructureGraph class.
Changed Files: source/net/yacy/peers/graphics/WebStructureGraph.java, test/java/net/yacy/peers/graphics/WebStructureGraphTest.java
Tue Jan 17 15:59:55 CET 2017
by luccioman
Made sure webstructure.xml API produces valid XML.

Host names should not contain XML special characters such as quotation
mark, but at this stage the WebGraph may have mistakenly recorded a host
name with such characters. What's more the DigestURL constructor does
not prevent this.
By using serverObjects.putXML to encode host names we ensure
here the rendered XML is well formed and can be parsed by external tools
even if a structure entry is incorrect.
Changed Files: htroot/api/webstructure.java
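What encoding a host name for XML output amounts to can be sketched with a small stand-alone escaper. This is a stand-in written for illustration and not the serverObjects.putXML implementation; the class XmlTextEscaper is hypothetical.

```java
/** Hypothetical stand-in showing the escaping of XML special characters. */
final class XmlTextEscaper {

    /** Replaces the five XML special characters by their predefined entities. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                case '"':
                    sb.append("&quot;");
                    break;
                case '\'':
                    sb.append("&apos;");
                    break;
                default:
                    sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```

A wrongly recorded host name containing a quotation mark then ends up as &quot; inside the attribute value instead of terminating it early.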
Mon Jan 16 18:41:58 CET 2017
by luccioman
Fixed WatchWebStructure_p.html render to include https URLs.

As described in mantis 721 (http://mantis.tokeek.de/view.php?id=721)
WatchWebStructure_p.html failed to include in its structure view https
and other protocols and ports than default http.
Changed Files: htroot/WebStructurePicture_p.java, source/net/yacy/peers/graphics/WebStructureGraph.java, test/java/net/yacy/peers/graphics/WebStructureGraphTest.java
Mon Jan 16 16:41:06 CET 2017
by luccioman
Fixed webstructure.xml API used with a domain name 'about' parameter.

As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720),
when requesting this API with a domain name instead of a complete URL
only HTTP references on default port were listed.
Changed Files: htroot/api/webstructure.java, source/net/yacy/peers/graphics/WebStructureGraph.java, test/java/net/yacy/peers/graphics/WebStructureGraphTest.java
Mon Jan 16 10:18:42 CET 2017
by luccioman
Factored code re-implementing DigestURL.hosthash() method.

This ensures a consistent implementation of the url host hash generation
and makes usages easier to find in the source code.

Also added a unit test for this function.
Changed Files: htroot/WebStructurePicture_p.java, source/net/yacy/cora/document/id/DigestURL.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/graphics/WebStructureGraph.java, source/net/yacy/search/Switchboard.java, test/java/net/yacy/cora/document/id/DigestURLTest.java
Fri Jan 13 16:10:59 CET 2017
by luccioman
Added automated unit tests and perfs test for WebStructureGraph class.

Fixed references count when multiple links target the same domain name
in one document.
Changed Files: source/net/yacy/peers/graphics/WebStructureGraph.java, test/java/net/yacy/peers/graphics/WebStructureGraphTest.java
Fri Jan 13 16:05:46 CET 2017
by luccioman
Factored common code with DigestURL.hosthash()
Changed Files: htroot/HostBrowser.java, htroot/api/webstructure.java
Thu Jan 12 17:52:47 CET 2017
by luccioman
Detailed some Javadoc related to /api/webstructure.xml usage.
Changed Files: htroot/api/webstructure.java, source/net/yacy/peers/graphics/WebStructureGraph.java
Thu Jan 12 01:36:30 CET 2017
by reger
Start to rename "Augmented Browsing" to "Web Proxy ..." / "View via Proxy"
The Augmented Browsing option was reduced to the web proxy functionality.
Augmented browsing is not available and no known plan exists to reimplement
alteration of result pages with additional information.
Changed Files: htroot/AugmentedBrowsing_p.html, htroot/ConfigSearchPage_p.html, htroot/yacysearchitem.html, locales/de.lng, locales/master.lng.xlf
Mon Jan 09 16:45:31 CET 2017
by luccioman
Ignore generated Javadoc with git SCM.
Changed Files: .gitignore
Sat Jan 07 18:24:29 CET 2017
by reger
fix DC.Elements namespace in DublinCore vocabulary class
delete redundant (unused) DCElements.
Changed Files: source/net/yacy/cora/lod/vocabulary/DublinCore.java
Fri Jan 06 12:24:31 CET 2017
by luccioman
Blacklist import and update performance improvements.

Measurement sample : import from blacklist local file containing about
15000 entries
 - before refactoring : several minutes
 - after refactoring : a few seconds!
Changed Files: htroot/BlacklistCleaner_p.java, htroot/IndexControlRWIs_p.java, htroot/sharedBlacklist_p.java, source/net/yacy/repository/Blacklist.java, source/net/yacy/repository/BlacklistHostAndPath.java
Fri Jan 06 11:23:40 CET 2017
by luccioman
Added some JavaDoc.
Changed Files: htroot/sharedBlacklist_p.java, source/net/yacy/server/serverObjects.java
Fri Jan 06 09:00:28 CET 2017
by luccioman
Display result favicons only for http or https resources.

Favicon display only makes sense for http(s) websites, being public or
intranet. So I modified the favicon conditional display to verify the
result URL protocol rather than if we are in intranet mode.

Also prevented rendering an img HTML tag with empty src on other results
protocols such as ftp or file.

Fixing this thanks to priest2 report
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5923).
Changed Files: htroot/yacysearchitem.html, htroot/yacysearchitem.java, htroot/yacysearchitem.json
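The protocol based condition can be sketched as below. This is not the yacysearchitem servlet code: the class FaviconPolicy and its method are hypothetical and only rely on java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical illustration of a favicon display condition based on the result URL. */
final class FaviconPolicy {

    /** True only for results whose URL scheme is http or https. */
    static boolean shouldShowFavicon(String resultUrl) {
        try {
            String scheme = new URI(resultUrl).getScheme();
            return "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
        } catch (URISyntaxException e) {
            return false; // unparsable URL: render no img tag at all
        }
    }
}
```

When the method returns false the template would omit the img tag entirely rather than render it with an empty src.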
Fri Jan 06 03:01:52 CET 2017
by reger
fix concurrency issue with htmlParser using not current scraper data
resulting in incorrect data for some html index metadata.
Details see http://mantis.tokeek.de/view.php?id=717
Changed Files: source/net/yacy/document/AbstractParser.java, source/net/yacy/document/Document.java, source/net/yacy/document/content/DCEntry.java, source/net/yacy/document/parser/genericParser.java, source/net/yacy/document/parser/htmlParser.java, source/net/yacy/search/schema/CollectionConfiguration.java
Thu Jan 05 14:54:59 CET 2017
by luccioman
Added descriptive titles to Crawler_p.html speed settings.

As reported by bubul
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5924), the meaning of
the LF and MH acronyms was not explained.
Also added label tags for improved accessibility on these input fields.
Changed Files: htroot/Crawler_p.html
Thu Jan 05 00:24:37 CET 2017
by reger
fix exception on URIMetadataNode instantiation with corrected id hash on
host_id_s. Use Solr setField instead of addField to prevent
java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
	at net.yacy.kelondro.data.meta.URIMetadataNode.hosthash(URIMetadataNode.java:247)
	at net.yacy.search.query.SearchEvent.addNodes(SearchEvent.java:966)
	at net.yacy.peers.Protocol.solrQuery(Protocol.java:1242)
	at net.yacy.peers.RemoteSearch$2.run(RemoteSearch.java:349)
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java
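The difference between the two calls can be shown with a small map-based stand-in. This sketch does not use SolrJ: the class FieldStore is hypothetical and only mimics the documented behavior that addField appends to an existing field (turning it into a list) while setField replaces the value. The field values are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in mimicking add/set semantics of a document field store. */
final class FieldStore {
    private final Map<String, Object> fields = new HashMap<>();

    /** Replaces any existing value of the field. */
    void setField(String name, Object value) {
        fields.put(name, value);
    }

    /** Appends to the field: a second value turns the field into a list. */
    @SuppressWarnings("unchecked")
    void addField(String name, Object value) {
        Object existing = fields.get(name);
        if (existing == null) {
            fields.put(name, value);
        } else if (existing instanceof List) {
            ((List<Object>) existing).add(value);
        } else {
            List<Object> values = new ArrayList<>();
            values.add(existing);
            values.add(value);
            fields.put(name, values);
        }
    }

    Object getFieldValue(String name) {
        return fields.get(name);
    }
}
```

After an addField on an already filled host_id_s the stored value is a list, so a later cast to String fails as in the stack trace above, while setField keeps a single String.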
Mon Jan 02 14:23:25 CET 2017
by luccioman
Upgraded Apache Ant to 1.10.0 for the Alpine flavor Docker image. 
Changed Files: docker/Dockerfile.alpine
Mon Jan 02 10:24:17 CET 2017
by luccioman
Adjusted crawl depth control for FTP crawl start URLs.
Changed Files: source/net/yacy/crawler/CrawlStacker.java
Mon Jan 02 03:04:21 CET 2017
by reger
Complete harmonization RequestHeader getCookie with std ServletRequest
to use javax.servlet.http.Cookie parameters.
Deprecate now obsolete getHeaderCookies.
Adjust setting of MaxAge to spec if >= 0 otherwise keep default.
Changed Files: htroot/CookieTest_p.java, htroot/User.java, source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/cora/protocol/ResponseHeader.java, source/net/yacy/data/UserDB.java, source/net/yacy/search/Switchboard.java
Sun Jan 01 23:58:38 CET 2017
by reger
On negative result vote also delete document from fulltext index
(not only from dht)
Changed Files: htroot/yacysearch.java
Sun Jan 01 23:54:18 CET 2017
by reger
Merge origin/master
Changed Files: docker/Dockerfile, docker/Dockerfile.alpine, docker/Readme.md, startYACY.sh
Sun Jan 01 23:53:44 CET 2017
by reger
fix of fulltext.remove() by id of webgraph document
webgraph has document hash in source_id_s
Changed Files: source/net/yacy/search/index/Fulltext.java
Sat Dec 31 09:51:07 CET 2016
by luccioman
Fixed docker stop behavior.

- Adjusted start script in debug mode to make sure the main java process
can receive signals such as SIGTERM
- Modified docker images main command to properly propagate SIGTERM
signal to the main java process
Changed Files: docker/Dockerfile, docker/Dockerfile.alpine, docker/Readme.md, startYACY.sh
Wed Dec 28 09:47:27 CET 2016
by luccioman
Fixed YaCy proper shutdown triggered by SIGTERM signal.

The main shutdown hook thread was not properly waiting for the main
thread termination which consequently could not properly close resources
and threads. After terminating a running YaCy peer this way (Ctrl+C in
console, or kill <pid> for example), you could see the still existing
DATA/yacy.running file.

Tested with :
 - Debian Jessie openjdk 7 and 8 : regular shutdown, Ctrl+C, kill
command, system restart while yacy is running
 - Windows 10 Oracle JDK 7 and 8 : non regression on regular shutdown 
Changed Files: source/net/yacy/search/Switchboard.java, source/net/yacy/yacy.java
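The principle of a shutdown hook that waits for the main thread can be sketched as follows. This is not the code of yacy.java: the class GracefulShutdown, the stop callback and the wait limit are hypothetical names chosen for the example.

```java
/** Hypothetical illustration of a shutdown hook waiting for the main thread. */
final class GracefulShutdown {

    /**
     * Builds a hook that first asks the application to stop, then waits
     * (at most maxWaitMillis, 0 meaning no limit) for the main thread to end,
     * so that it can close its resources before the JVM exits.
     */
    static Thread newShutdownHook(final Thread mainThread, final Runnable requestStop,
            final long maxWaitMillis) {
        return new Thread(() -> {
            requestStop.run();
            try {
                mainThread.join(maxWaitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
        }, "shutdown-hook");
    }
}
```

The hook would be registered once at startup with Runtime.getRuntime().addShutdownHook(...). Since the JVM halts as soon as all hooks have returned, a hook that does not join the main thread leaves it no time to clean up, which matches the leftover DATA/yacy.running file described above.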