Reply
New Contributor
Posts: 5
Registered: ‎05-10-2017

Solr kite-morphlines-hadoop-sequencefile sample example

Hi,

 

I have got millions of small xml's (in KBs) stored in zipped file at S3. I need to create a solution to provide a search capability over it so that if a request comes up for a certain tag text then all xmls(in whole) containing that text should be returned.

 

I am thinking to create a sequence file for the same to avoid small files problem which will be indexed using Solr. 

 

Questions:

1. I read there is "kite-morphlines-hadoop-sequencefile" morphline command that can be used to read and index it into Solr. Can anyone share any working example of the same ?

 

2. Also, can anyone share any insight on what should I do to send the whole xml/xmls based on search criteria back to the requestor given the fact that it will be in a sequence file.

 

Many Thanks

Announcements
The Kite SDK is a collection of docs, sample code, APIs, and tools to make Hadoop application development faster. Learn more at http://kitesdk.org.