Support Questions
Find answers, ask questions, and share your expertise

Solr kite-morphlines-hadoop-sequencefile sample example


Solr kite-morphlines-hadoop-sequencefile sample example




I have got millions of small xml's (in KBs) stored in zipped file at S3. I need to create a solution to provide a search capability over it so that if a request comes up for a certain tag text then all xmls(in whole) containing that text should be returned.


I am thinking to create a sequence file for the same to avoid small files problem which will be indexed using Solr. 



1. I read there is "kite-morphlines-hadoop-sequencefile" morphline command that can be used to read and index it into Solr. Can anyone share any working example of the same ?


2. Also, can anyone share any insight on what should I do to send the whole xml/xmls based on search criteria back to the requestor given the fact that it will be in a sequence file.


Many Thanks