<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Update CSV attribute/Merge CSV files in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197030#M62320</link>
    <description>&lt;P&gt;Hello &lt;A rel="user" href="https://community.cloudera.com/users/13785/arsalan-siddiqi.html" nodeid="13785"&gt;@Arsalan Siddiqi&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I think this is possible with NiFi 1.2.0 or later, using the QueryRecord processor.&lt;/P&gt;&lt;P&gt;The basic idea is to use MergeContent to create a single FlowFile containing all the CSV files. When doing so, add a column that identifies which attribute each row belongs to (I used an 'm' column in my example).&lt;/P&gt;&lt;P&gt;Then use the QueryRecord processor to join the records and produce rows that contain the different attributes as columns:&lt;/P&gt;&lt;PRE&gt;select fa.t, fa.v a, fb.v b, fc.v c
 from (
   select t, v from FLOWFILE where m = 'a'
 ) fa
 left join (
   select t, v from FLOWFILE where m = 'b'
 ) fb on fa.t = fb.t
 left join (
   select t, v from FLOWFILE where m = 'c'
 ) fc on fa.t = fc.t&lt;/PRE&gt;&lt;P&gt;I've created a Gist with a NiFi flow template. I hope this helps:&lt;/P&gt;&lt;P&gt;&lt;A href="https://gist.github.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed" target="_blank"&gt;https://gist.github.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 05 Jun 2017 13:11:07 GMT</pubDate>
    <dc:creator>kkawamura</dc:creator>
    <dc:date>2017-06-05T13:11:07Z</dc:date>
    <item>
      <title>Update CSV attribute/Merge CSV files</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197029#M62319</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;I have multiple CSV files, each containing the values of one attribute over time. There are 60 files in total (60 different attributes). They are basically dumps of Spark metrics.&lt;/P&gt;&lt;P&gt;The file name is the name of the application followed by the attribute name.&lt;/P&gt;&lt;P&gt;In the example below, the application name is local-1495979652246 and the attribute for the first file is BlockManager.disk.diskSpaceUsed_MB:&lt;/P&gt;&lt;P&gt;local-1495979652246.driver.BlockManager.disk.diskSpaceUsed_MB.csv&lt;/P&gt;&lt;P&gt;local-1495979652246.driver.BlockManager.memory.maxMem_MB.csv&lt;/P&gt;&lt;P&gt;local-1495979652246.driver.BlockManager.memory.memUsed_MB.csv&lt;/P&gt;&lt;P&gt;Each file contains values like:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;t&lt;/TD&gt;&lt;TD&gt;value&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1496588167&lt;/TD&gt;&lt;TD&gt;0.003329809088456&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1496588168&lt;/TD&gt;&lt;TD&gt;0.00428465362778284&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The file name specifies the name of the attribute.&lt;/P&gt;&lt;P&gt;The first thing I need to do is rename the CSV header field called value to the attribute name taken from the file name:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;t&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;BlockManager.disk.diskSpaceUsed_MB&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1496588167&lt;/TD&gt;&lt;TD&gt;0.003329809088456&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The next step would be to merge all files for the same application based on the value of the field t, so that I eventually have one CSV file per application containing the values of all the attributes, like:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;t&lt;/TD&gt;&lt;TD&gt;BlockManager.disk.diskSpaceUsed_MB&lt;/TD&gt;&lt;TD&gt;BlockManager.memory.maxMem_MB&lt;/TD&gt;&lt;TD&gt;BlockManager.memory.memUsed_MB&lt;/TD&gt;&lt;TD&gt;more attributes...&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1496588167&lt;/TD&gt;&lt;TD&gt;0.003329809088456&lt;/TD&gt;&lt;TD&gt;some value&lt;/TD&gt;&lt;TD&gt;some value&lt;/TD&gt;&lt;TD&gt;some value&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1496588168&lt;/TD&gt;&lt;TD&gt;0.00428465362778284&lt;/TD&gt;&lt;TD&gt;some value&lt;/TD&gt;&lt;TD&gt;some value&lt;/TD&gt;&lt;TD&gt;..&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Any suggestions?&lt;/P&gt;</description>
      <pubDate>Sun, 04 Jun 2017 23:50:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197029#M62319</guid>
      <dc:creator>arsalan_siddiqi</dc:creator>
      <dc:date>2017-06-04T23:50:59Z</dc:date>
    </item>
    <item>
      <title>Re: Update CSV attribute/Merge CSV files</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197030#M62320</link>
      <description>&lt;P&gt;Hello &lt;A rel="user" href="https://community.cloudera.com/users/13785/arsalan-siddiqi.html" nodeid="13785"&gt;@Arsalan Siddiqi&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I think this is possible with NiFi 1.2.0 or later, using the QueryRecord processor.&lt;/P&gt;&lt;P&gt;The basic idea is to use MergeContent to create a single FlowFile containing all the CSV files. When doing so, add a column that identifies which attribute each row belongs to (I used an 'm' column in my example).&lt;/P&gt;&lt;P&gt;Then use the QueryRecord processor to join the records and produce rows that contain the different attributes as columns:&lt;/P&gt;&lt;PRE&gt;select fa.t, fa.v a, fb.v b, fc.v c
 from (
   select t, v from FLOWFILE where m = 'a'
 ) fa
 left join (
   select t, v from FLOWFILE where m = 'b'
 ) fb on fa.t = fb.t
 left join (
   select t, v from FLOWFILE where m = 'c'
 ) fc on fa.t = fc.t&lt;/PRE&gt;&lt;P&gt;I've created a Gist with a NiFi flow template. I hope this helps:&lt;/P&gt;&lt;P&gt;&lt;A href="https://gist.github.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed" target="_blank"&gt;https://gist.github.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 05 Jun 2017 13:11:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197030#M62320</guid>
      <dc:creator>kkawamura</dc:creator>
      <dc:date>2017-06-05T13:11:07Z</dc:date>
    </item>
    <item>
      <title>Re: Update CSV attribute/Merge CSV files</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197031#M62321</link>
      <description>&lt;P&gt;Thank you for your reply. I see that you are using version 1.3.0, which I do not have. I tried to import the template, but I get an error saying the UpdateRecord processor is not known. Is the NAR file available?&lt;/P&gt;</description>
      <pubDate>Mon, 05 Jun 2017 16:16:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197031#M62321</guid>
      <dc:creator>arsalan_siddiqi</dc:creator>
      <dc:date>2017-06-05T16:16:10Z</dc:date>
    </item>
    <item>
      <title>Re: Update CSV attribute/Merge CSV files</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197032#M62322</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/13785/arsalan-siddiqi.html" nodeid="13785"&gt;@Arsalan Siddiqi&lt;/A&gt; I assumed UpdateRecord had been available since 1.2.0, but it was not. Sorry about that. I created another template that doesn't use UpdateRecord; instead, it uses another QueryRecord to update the CSV. I confirmed that it works with NiFi 1.2.0. I hope you find it useful.&lt;/P&gt;&lt;P&gt;&lt;A href="https://gist.githubusercontent.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed/raw/e150884f52ca186dd61433428b38d172aaa7b128/Join_CSV_Files_1.2.0.xml" target="_blank"&gt;https://gist.githubusercontent.com/ijokarumawak/7e20af1cd222fb2adf13acb2b0f46aed/raw/e150884f52ca186dd61433428b38d172aaa7b128/Join_CSV_Files_1.2.0.xml&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jun 2017 06:54:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Update-CSV-attribute-Merge-CSV-files/m-p/197032#M62322</guid>
      <dc:creator>kkawamura</dc:creator>
      <dc:date>2017-06-06T06:54:25Z</dc:date>
    </item>
  </channel>
</rss>

