<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Split each xml attribute  into separate tables stores in hive using nifi.. in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219913#M69620</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; Thank you for the response. Here is my input file (&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41460-xml.xml"&gt;xml.xml&lt;/A&gt;), and I have attached my expected output files &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41461-output-author.txt"&gt;output-author.txt&lt;/A&gt;, &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41462-output-books.txt"&gt;output-books.txt&lt;/A&gt;, and &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41463-output-bookstore.txt"&gt;output-bookstore.txt&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;I hope this makes my problem clear.&lt;/P&gt;</description>
    <pubDate>Sun, 22 Oct 2017 04:14:12 GMT</pubDate>
    <dc:creator>msure4</dc:creator>
    <dc:date>2017-10-22T04:14:12Z</dc:date>
    <item>
      <title>Split each xml attribute  into separate tables stores in hive using nifi..</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219912#M69619</link>
      <description>&lt;P&gt;Hello, I’m very new to NiFi and new to programming. Here is my scenario: I’m receiving different types of nested XML files from HTTP, SFTP, or local drives. I have to split those XML files by their nested (child) elements. Child elements of the same kind have to be saved in the same file, and they need a primary key or unique key so that the relationship between parent and child is known. &lt;/P&gt;&lt;P&gt; EX: A(root)-&amp;gt; B(Child)-&amp;gt;  C (Child of B)-&amp;gt;  D(Child of C)&lt;/P&gt;&lt;P&gt;A(root)-&amp;gt; B(Child)  C (Child of B)  D(Child of C)                  B(Child)  C (Child of B)  D(Child of C) &lt;/P&gt;&lt;P&gt;Now I need all of the B data saved in one table, all of the C data in another table, and so on. The relationship between child and parent must be maintained with a unique key (an existing key can be used if there is one; otherwise we have to generate unique keys to identify the relationships between the tables).&lt;/P&gt;&lt;P&gt;The same applies to the example below:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;customer&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;group&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;site&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;/site&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;/group&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;group&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;site&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;userline&amp;gt;&amp;lt;/userline&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;/site&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;/group&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;lt;/customer&amp;gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt; I already saw a solution for this at "https://community.hortonworks.com/questions/70087/complex-xml-to-hive-table-using-nifi.html", but I didn't understand it. Can someone walk me through, step by step, how to split the XML into separate files?&lt;/P&gt;&lt;P&gt;  Thank you in advance.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 22:41:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219912#M69619</guid>
      <dc:creator>msure4</dc:creator>
      <dc:date>2017-10-13T22:41:50Z</dc:date>
    </item>
    <item>
      <title>Re: Split each xml attribute  into separate tables stores in hive using nifi..</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219913#M69620</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; Thank you for the response. Here is my input file (&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41460-xml.xml"&gt;xml.xml&lt;/A&gt;), and I have attached my expected output files &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41461-output-author.txt"&gt;output-author.txt&lt;/A&gt;, &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41462-output-books.txt"&gt;output-books.txt&lt;/A&gt;, and &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/41463-output-bookstore.txt"&gt;output-bookstore.txt&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;I hope this makes my problem clear.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Oct 2017 04:14:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219913#M69620</guid>
      <dc:creator>msure4</dc:creator>
      <dc:date>2017-10-22T04:14:12Z</dc:date>
    </item>
    <item>
      <title>Re: Split each xml attribute  into separate tables stores in hive using nifi..</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219914#M69621</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/141524/split-each-xml-attribute-into-separate-tables-stor.html?childToView=143926#"&gt;@Shu&lt;/A&gt; : I created the output as .txt files only to make it easier to understand. I actually need to store the data in Hive or in an HDFS location.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Oct 2017 04:17:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219914#M69621</guid>
      <dc:creator>msure4</dc:creator>
      <dc:date>2017-10-22T04:17:19Z</dc:date>
    </item>
    <item>
      <title>Re: Split each xml attribute  into separate tables stores in hive using nifi..</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219915#M69622</link>
      <description>&lt;P&gt;@Shu &lt;/P&gt;&lt;P&gt;&amp;lt;?xml version="1.0"?&amp;gt;&lt;/P&gt;&lt;P&gt;
&amp;lt;?xml-stylesheet type="text/xsl" href="myfile.xsl" ?&amp;gt; &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;&amp;lt;bookstore specialty="novel"&amp;gt; &lt;/P&gt;&lt;P style="margin-left: 40px;"&gt;  &amp;lt;book style="autobiography"&amp;gt;&lt;/P&gt;&lt;P style="margin-left: 80px;"&gt;&amp;lt;author&amp;gt;
      &amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt; &lt;/P&gt;&lt;P style="margin-left: 80px;"&gt;      &amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt;&lt;/P&gt;&lt;P style="margin-left: 80px;"&gt;
      &amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt;&lt;/P&gt;&lt;P style="margin-left: 80px;"&gt;
    &amp;lt;/author&amp;gt;&lt;/P&gt;&lt;P style="margin-left: 80px;"&gt;
    &amp;lt;price&amp;gt;12&amp;lt;/price&amp;gt; &lt;/P&gt;&lt;P style="margin-left: 60px;"&gt;&amp;lt;/book&amp;gt; &lt;/P&gt;&lt;P style="margin-left: 40px;"&gt;&amp;lt;/bookstore&amp;gt;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Oct 2017 04:37:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219915#M69622</guid>
      <dc:creator>msure4</dc:creator>
      <dc:date>2017-10-22T04:37:06Z</dc:date>
    </item>
    <item>
      <title>Re: Split each xml attribute  into separate tables stores in hive using nifi..</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219916#M69623</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/45440/msure4.html" nodeid="45440" target="_blank"&gt;@Mohan Sure&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;You can get the results you expect by using the following processors:&lt;/P&gt;&lt;PRE&gt;EvaluateXQuery //extract all the required contents into attributes of the flowfile.
UpdateAttribute //update the contents of the attributes that were extracted by the EvaluateXQuery processor.
ReplaceText //replace the flowfile content with the attributes of the flowfile.
PutHDFS //store the files in HDFS.&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;EvaluateXQuery Configurations:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Change the existing properties &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1.Destination to &lt;/P&gt;&lt;PRE&gt;flowfile-attribute&lt;/PRE&gt;&lt;P&gt;2.Output: Omit XML Declaration to&lt;/P&gt;&lt;PRE&gt;true&lt;/PRE&gt;&lt;P&gt;Add new &lt;STRONG&gt;properties&lt;/STRONG&gt; by clicking the&lt;STRONG&gt; +&lt;/STRONG&gt; sign&lt;/P&gt;&lt;P&gt;1.author&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;//author&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;2.book&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;//book&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;3.bookstore&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;//bookstore&lt;/PRE&gt;
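&lt;P&gt;To check outside NiFi what the three queries above select, a minimal Python sketch may help. It is only an approximation: it uses the standard library's xml.etree.ElementTree instead of NiFi's XQuery engine, and the helper names build_sample and select_all are hypothetical. The sample tree is built in code to mirror the bookstore input from this thread.&lt;/P&gt;
&lt;PRE&gt;
```python
import xml.etree.ElementTree as ET


def build_sample():
    """Build the bookstore document from this thread as an element tree."""
    bookstore = ET.Element("bookstore", {"specialty": "novel"})
    book = ET.SubElement(bookstore, "book", {"style": "autobiography"})
    author = ET.SubElement(book, "author")
    ET.SubElement(author, "first-name").text = "Joe"
    ET.SubElement(author, "last-name").text = "Bob"
    ET.SubElement(author, "award").text = "Trenton Literary Review Honorable Mention"
    ET.SubElement(book, "price").text = "12"
    return bookstore


def select_all(root, tag):
    """Approximate the query //tag: every element named tag, at any depth."""
    # Assumption: ElementTree stands in for NiFi's XQuery engine here.
    # Element.iter() walks the root itself and all of its descendants.
    return [ET.tostring(node, encoding="unicode") for node in root.iter(tag)]


root = build_sample()
# One entry per query, keyed like the dynamic properties added in EvaluateXQuery.
attributes = {name: select_all(root, name) for name in ("author", "book", "bookstore")}
for name, fragments in attributes.items():
    print(name, len(fragments), "match(es)")
```
&lt;/PRE&gt;&lt;P&gt;Note that the book fragment still contains the nested author element, and the bookstore fragment contains everything, which is why the UpdateAttribute step afterwards trims each attribute down to its own level.&lt;/P&gt;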
&lt;/DIV&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;?xml-stylesheet type="text/xsl" href="myfile.xsl" ?&amp;gt;
&amp;lt;bookstore specialty="novel"&amp;gt;
&amp;lt;book style="autobiography"&amp;gt;
&amp;lt;author&amp;gt; 
&amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt;
&amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt;
&amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt;
&amp;lt;/author&amp;gt;
&amp;lt;price&amp;gt;12&amp;lt;/price&amp;gt;
&amp;lt;/book&amp;gt;
&amp;lt;/bookstore&amp;gt;&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="41465-attrjs.png" style="width: 707px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15934i731D1FBB89800CEA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="41465-attrjs.png" alt="41465-attrjs.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;As the screenshot shows, all the content is now stored in attributes (book, bookstore, author) of the flowfile.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;EvaluateXQuery Processor configs screenshot:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="41464-evaluatexquery.png" style="width: 574px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15935i5183B1C044DB5263/image-size/medium?v=v2&amp;amp;px=400" role="button" title="41464-evaluatexquery.png" alt="41464-evaluatexquery.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;UpdateAttribute Processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1.author&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;${author:replaceAll('&amp;lt;author&amp;gt;([\s\S]+.*)&amp;lt;\/author&amp;gt;','$1')}&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;This updates the author attribute by removing the enclosing &amp;lt;author&amp;gt; tags.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input to the UpdateAttribute processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;author&amp;gt; &amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt; &amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt; &amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt; &amp;lt;/author&amp;gt;&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt; &amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt; &amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt;&lt;/PRE&gt;&lt;P&gt;2.book&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;${book:replaceAll('&amp;lt;book\s(.*?)&amp;gt;[\s\S]+&amp;lt;\/author&amp;gt;([\s\S]+)&amp;lt;\/book&amp;gt;','$1$2')}&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;book style="autobiography"&amp;gt; &amp;lt;author&amp;gt; &amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt; &amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt; &amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt; &amp;lt;/author&amp;gt; &amp;lt;price&amp;gt;12&amp;lt;/price&amp;gt; &amp;lt;/book&amp;gt;&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;style="autobiography" &amp;lt;price&amp;gt;12&amp;lt;/price&amp;gt;&lt;/PRE&gt;&lt;P&gt;3.bookstore&lt;/P&gt;&lt;PRE&gt;${bookstore:replaceAll('.*&amp;lt;bookstore\s(.*?)&amp;gt;[\s\S]+.*','$1')}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Input:-&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;bookstore specialty="novel"&amp;gt; &amp;lt;book style="autobiography"&amp;gt; &amp;lt;author&amp;gt; &amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt; &amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt; &amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt; &amp;lt;/author&amp;gt; &amp;lt;price&amp;gt;12&amp;lt;/price&amp;gt; &amp;lt;/book&amp;gt; &amp;lt;/bookstore&amp;gt;&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Output:-&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;specialty="novel"
&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="41466-update-attr.png" style="width: 606px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15936iA9D4EE2F52401F99/image-size/medium?v=v2&amp;amp;px=400" role="button" title="41466-update-attr.png" alt="41466-update-attr.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;ReplaceText Processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Change the &lt;STRONG&gt;Replacement Strategy&lt;/STRONG&gt; property to &lt;/P&gt;&lt;PRE&gt;Always Replace&lt;/PRE&gt;&lt;P&gt;and use one of your attributes (bookstore, book, or author) in this processor, so that the existing content of the flowfile is overwritten with the new content.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="41468-replace-text.png" style="width: 557px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15937i946FB897783B8308/image-size/medium?v=v2&amp;amp;px=400" role="button" title="41468-replace-text.png" alt="41468-replace-text.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Add two more ReplaceText processors for the book and author attributes.&lt;/P&gt;&lt;P&gt;Output:-&lt;/P&gt;&lt;P&gt;&amp;lt;first-name&amp;gt;Joe&amp;lt;/first-name&amp;gt;
&amp;lt;last-name&amp;gt;Bob&amp;lt;/last-name&amp;gt;
&amp;lt;award&amp;gt;Trenton Literary Review Honorable Mention&amp;lt;/award&amp;gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;PutHDFS processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Configure the processor and set the directory where you want to store the data.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Flow Screenshot:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="41469-flow.png" style="width: 1404px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15938i35FE39D8C418DCA8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="41469-flow.png" alt="41469-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;For testing purposes I used a GenerateFlowFile processor. In your case, replace it with the source processor from which you are actually getting this XML data.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 02:34:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-each-xml-attribute-into-separate-tables-stores-in-hive/m-p/219916#M69623</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T02:34:16Z</dc:date>
    </item>
  </channel>
</rss>

