<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: index nested structure in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/index-nested-structure/m-p/17712#M2673</link>
    <description>&lt;P&gt;To answer my own question: No this is not possible at this time, since Solr only started supporting nested documents since 4.5 and CDH5.1 is at 4.4 right now. Even if this becomes available in a future release the question will be whether or not this can be easily integrated&amp;nbsp;and used with Kites morphlines.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For getting the job done I had to switch to using ElasticSearch, which does support nested documents and used Flume's ElasticSearchSink. Flume's official documetation on elasticsearch and avro is lacking and I had to patch flume code to get it working with UTF-8 charset and Json, but it's working nonetheless. Hope I can move this dataflow to the better integrated SolrCloud in the future.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 27 Aug 2014 18:57:19 GMT</pubDate>
    <dc:creator>RobV</dc:creator>
    <dc:date>2014-08-27T18:57:19Z</dc:date>
    <item>
      <title>index nested structure</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/index-nested-structure/m-p/17350#M2672</link>
      <description>&lt;P&gt;I have an avro input source, going through a morphline into Solr.&amp;nbsp;For example the following structure:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;{&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; "username" : "alex"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; "date" : "21-08-2014"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; "attachments" : [&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "documents" : [&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "title": "test"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "tags" : [ "a", "b", "c" ]&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; },&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "optional1" : "test2"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "title" : "test2"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "source" : "school"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; ]&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can extract with extractAvroPath, like so:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;{ extractAvroPaths {&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;flatten : true&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;paths : {&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/my_user : /username &amp;nbsp; &amp;nbsp; &amp;nbsp; # this works fine&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/my_attachments : "/attachments[]"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/my_documents : "/attachments[]/documents[]"&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;}&lt;/P&gt;&lt;P&gt;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;.....&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem being that /my_attachments or /my_documents now contain raw json/avro structures instead of a single field. How would I go about 'unwrapping' these fields so that they are all part of one solr document, while still retaining their context of the document they belong to?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:05:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/index-nested-structure/m-p/17350#M2672</guid>
      <dc:creator>RobV</dc:creator>
      <dc:date>2022-09-16T09:05:39Z</dc:date>
    </item>
    <item>
      <title>Re: index nested structure</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/index-nested-structure/m-p/17712#M2673</link>
      <description>&lt;P&gt;To answer my own question: No this is not possible at this time, since Solr only started supporting nested documents since 4.5 and CDH5.1 is at 4.4 right now. Even if this becomes available in a future release the question will be whether or not this can be easily integrated&amp;nbsp;and used with Kites morphlines.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For getting the job done I had to switch to using ElasticSearch, which does support nested documents and used Flume's ElasticSearchSink. Flume's official documetation on elasticsearch and avro is lacking and I had to patch flume code to get it working with UTF-8 charset and Json, but it's working nonetheless. Hope I can move this dataflow to the better integrated SolrCloud in the future.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Aug 2014 18:57:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/index-nested-structure/m-p/17712#M2673</guid>
      <dc:creator>RobV</dc:creator>
      <dc:date>2014-08-27T18:57:19Z</dc:date>
    </item>
  </channel>
</rss>

