Created 10-13-2015 04:20 PM
Need installation and configuration docs for Nutch, Tika, and Stanbol, if we support them.
Created 10-13-2015 05:23 PM
Created 10-13-2015 05:55 PM
Thanks, Andrew. I found that Tika is a library shipped with Solr. I couldn't find Nutch or Stanbol. I will let the customer know about the support status, as suggested.
Created 12-11-2015 01:27 PM
mmadan: Nutch is a full web crawling system that runs on Hadoop. It has been around for many years, and Hadoop itself originally grew out of the Nutch project.
I tried supporting Nutch for a while (not through Hortonworks, of course), but it is still very much R&D software because so few companies use it. There is also significant confusion about its move from MapReduce to YARN.
Stanbol is something I am less familiar with, but since it is built from a large number of Apache projects, I think it would be as complicated as Hadoop to support.