Project Description

Nutch is highly scalable web search software built
on top of Apache Hadoop and Lucene Java. Key
features include a web crawler, an indexer, crawl
management tools, parsers for HTML, PDF, DOC, and
several other document formats, and an extensible
architecture that lets you plug in additional
functionality such as document parsers, custom
scoring algorithms, protocols, and more.

