<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to extract csv column record and used it for  file name and sheet name ? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309063#M223755</link>
    <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;I receive a CSV file containing a number of records. The task is to read the CSV, split it so that each record becomes its own file, and name each file and its sheet after the value in the 1st column of that record (the 1st row is the header and is excluded).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. So far, I read the CSV file using GetFile.&lt;/P&gt;
&lt;P&gt;2. Then I used the SplitText processor to split each record into its own CSV file, with the Header Line Count property set to 1.&lt;/P&gt;
&lt;P&gt;3. Next, I need to extract the column 1 value of the 1st record (2nd row and 2nd column) and use that value as the file name and sheet name of each individual file.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Original csv file:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="murali2425_0-1610040181706.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30054i730C9F53C8B49561/image-size/medium?v=v2&amp;amp;px=400" role="button" title="murali2425_0-1610040181706.png" alt="murali2425_0-1610040181706.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;After the split there should be two files, named ab123.csv and c35ks.csv, and the sheet name of each should be changed as well.&lt;/P&gt;
&lt;TABLE width="598"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="64"&gt;ID&lt;/TD&gt;
&lt;TD width="338"&gt;Description&lt;/TD&gt;
&lt;TD width="196"&gt;status&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;ab123&lt;/TD&gt;
&lt;TD&gt;Eldon Base for stackable storage shelf, platinum&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;ab123.csv&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="598"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="63px" height="30px"&gt;ID&lt;/TD&gt;
&lt;TD width="338px" height="30px"&gt;Description&lt;/TD&gt;
&lt;TD width="196px" height="30px"&gt;status&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="63px" height="57px"&gt;c35ks&lt;/TD&gt;
&lt;TD width="338px" height="57px"&gt;1.7 Cubic Foot Compact "Cube" Office Refrigerators&lt;/TD&gt;
&lt;TD width="196px" height="57px"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;c35ks.csv&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How can I get the above outputs from the workflow?&lt;/P&gt;</description>
    <pubDate>Fri, 08 Jan 2021 07:08:57 GMT</pubDate>
    <dc:creator>murali2425</dc:creator>
    <dc:date>2021-01-08T07:08:57Z</dc:date>
    <item>
      <title>How to extract csv column record and used it for  file name and sheet name ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309063#M223755</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;I receive a CSV file containing a number of records. The task is to read the CSV, split it so that each record becomes its own file, and name each file and its sheet after the value in the 1st column of that record (the 1st row is the header and is excluded).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. So far, I read the CSV file using GetFile.&lt;/P&gt;
&lt;P&gt;2. Then I used the SplitText processor to split each record into its own CSV file, with the Header Line Count property set to 1.&lt;/P&gt;
&lt;P&gt;3. Next, I need to extract the column 1 value of the 1st record (2nd row and 2nd column) and use that value as the file name and sheet name of each individual file.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Original csv file:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="murali2425_0-1610040181706.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30054i730C9F53C8B49561/image-size/medium?v=v2&amp;amp;px=400" role="button" title="murali2425_0-1610040181706.png" alt="murali2425_0-1610040181706.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;After the split there should be two files, named ab123.csv and c35ks.csv, and the sheet name of each should be changed as well.&lt;/P&gt;
&lt;TABLE width="598"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="64"&gt;ID&lt;/TD&gt;
&lt;TD width="338"&gt;Description&lt;/TD&gt;
&lt;TD width="196"&gt;status&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;ab123&lt;/TD&gt;
&lt;TD&gt;Eldon Base for stackable storage shelf, platinum&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;ab123.csv&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="598"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="63px" height="30px"&gt;ID&lt;/TD&gt;
&lt;TD width="338px" height="30px"&gt;Description&lt;/TD&gt;
&lt;TD width="196px" height="30px"&gt;status&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="63px" height="57px"&gt;c35ks&lt;/TD&gt;
&lt;TD width="338px" height="57px"&gt;1.7 Cubic Foot Compact "Cube" Office Refrigerators&lt;/TD&gt;
&lt;TD width="196px" height="57px"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;c35ks.csv&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How can I get the above outputs from the workflow?&lt;/P&gt;</description>
      <pubDate>Fri, 08 Jan 2021 07:08:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309063#M223755</guid>
      <dc:creator>murali2425</dc:creator>
      <dc:date>2021-01-08T07:08:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract csv column record and used it for  file name and sheet name ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309175#M223764</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I tried out a solution for you:&lt;/P&gt;&lt;P&gt;1) GenerateFlowFile&lt;/P&gt;&lt;P&gt;In your flow, this is your GetFile processor that picks up the CSV file.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2) ConvertRecord&lt;/P&gt;&lt;P&gt;Convert the CSV to JSON using a&amp;nbsp;&lt;SPAN&gt;CSVReader and a&amp;nbsp;JsonRecordSetWriter.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;3) SplitJson&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Split each object&amp;nbsp;(CSV row) with&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;$.*&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;as the path.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;4) EvaluateJsonPath&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Add a dynamic property with the name&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;filename&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;and the value&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;$.ID&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;to write the ID into the filename attribute of the flowfile (this requires Destination to be set to flowfile-attribute).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;5) UpdateAttribute&lt;/P&gt;&lt;P&gt;Append the file extension to the filename attribute value:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;${filename:append('.csv')}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;6) ConvertRecord&lt;/P&gt;&lt;P&gt;The remaining question is how you want to proceed.&lt;/P&gt;&lt;P&gt;You can either convert the JSON back to CSV, or work with Wait/Notify so that you can hand the "filename" attribute over to your split CSV flowfile.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="How to extract csv column record and used it for file name and sheet name.png" style="width: 160px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30064i6AEDB371A0C1C8E8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="How to extract csv column record and used it for file name and sheet name.png" alt="How to extract csv column record and used it for file name and sheet name.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Jan 2021 12:10:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309175#M223764</guid>
      <dc:creator>Faerballert</dc:creator>
      <dc:date>2021-01-08T12:10:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract csv column record and used it for  file name and sheet name ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309184#M223768</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/72826"&gt;@murali2425&lt;/a&gt;&amp;nbsp;The solution you are looking for is QueryRecord configured with a CSV Record Reader and Record Writer. UpdateRecord and ConvertRecord can also use the Readers/Writers. This method is preferred over splitting the file and adds some nice functionality: it allows you to provide a schema for both the inbound CSV (reader) and the downstream CSV (writer). Using QueryRecord you should be able to split the file and set the filename attribute to the value of column1. At the end of the flow you should be able to use that filename attribute to save the new file.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can find some specific examples and configuration screenshots here:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/Running-SQL-on-FlowFiles-using-QueryRecord-Processor-Apache/ta-p/246671" rel="nofollow noreferrer" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/Running-SQL-on-FlowFiles-using-QueryRecord-Processor-Apache/ta-p/246671&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Steven&lt;/P&gt;</description>
      <pubDate>Fri, 08 Jan 2021 13:23:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-extract-csv-column-record-and-used-it-for-file-name/m-p/309184#M223768</guid>
      <dc:creator>stevenmatison</dc:creator>
      <dc:date>2021-01-08T13:23:57Z</dc:date>
    </item>
  </channel>
</rss>

