Support Questions

jochen_kempf · ‎06-22-2016

Can anyone point out what the output document of an XML file ingested by the Hadoop-Solr XML Ingest Mapper looks like?

cstanca · ‎12-28-2016

Start here: http://lucene.apache.org/solr/quickstart.html

Search for "Indexing Solr XML" and perform the steps indicated.

When you are done, you can browse the indexed documents at http://localhost:8983/solr/gettingstarted/browse. That is what the output you are interested in looks like. Of course, replace "localhost" with your own host in the URL. By default, the /browse UI view assumes the gettingstarted schema and data, which are a catch-all mix of structured XML, JSON, and CSV example data plus unstructured rich documents. Your own data may not look ideal at first, though the /browse templates are customizable.
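If it helps to see the shape of the result without the /browse UI, here is a rough Python sketch of how a file in the standard Solr XML update format (`<add><doc><field name="...">`) maps to flat field/value documents. This is not the Hadoop-Solr ingest mapper's code, only an illustration of the format. The sample data and the helper names `solr_xml_to_docs` and `select_url` are made up for this example, and the URL builder assumes the default port 8983 and the gettingstarted collection from the quickstart.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical sample in the Solr XML update format (not real data).
SAMPLE = """<add>
  <doc>
    <field name="id">book-1</field>
    <field name="title">A Sample Title</field>
    <field name="cat">book</field>
    <field name="cat">hardcover</field>
  </doc>
</add>"""


def solr_xml_to_docs(xml_text):
    """Turn Solr XML into a list of dicts, one dict per <doc> element.

    A field name that repeats inside a <doc> becomes a list, which
    mirrors how Solr treats multi-valued fields.
    """
    docs = []
    root = ET.fromstring(xml_text)
    for doc_el in root.iter("doc"):
        doc = {}
        for field in doc_el.findall("field"):
            name = field.get("name")
            value = field.text or ""
            if name in doc:
                # Second occurrence: promote the single value to a list.
                if not isinstance(doc[name], list):
                    doc[name] = [doc[name]]
                doc[name].append(value)
            else:
                doc[name] = value
        docs.append(doc)
    return docs


def select_url(host, collection, query="*:*", rows=10):
    """Build a URL for Solr's /select handler (assumes port 8983)."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"http://{host}:8983/solr/{collection}/select?{params}"


if __name__ == "__main__":
    for doc in solr_xml_to_docs(SAMPLE):
        print(doc)
    print(select_url("localhost", "gettingstarted"))
```

Opening the printed /select URL in a browser (with your own host) should return the indexed documents as JSON, which is often easier to read than the /browse view when checking individual field values.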

Cloudera Community

How does the XML Ingest Mapper for Hadoop-Solr parse an XML file?