<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: ValidateRecord doesn't maintain column order? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183220#M75570</link>
    <description>&lt;P&gt;Thanks, Pierre! Glad to help, and I'm especially grateful for the quick turnaround time.&lt;/P&gt;&lt;P&gt;I will switch to the explicit schema definition for now, while we still have only a few files (and therefore only a few schemas) to validate. Ideally, in the future we'll be able to use the header-based schema again when we have a large number of schemas coming through.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
    <pubDate>Mon, 12 Mar 2018 20:40:09 GMT</pubDate>
    <dc:creator>jessica_david</dc:creator>
    <dc:date>2018-03-12T20:40:09Z</dc:date>
    <item>
      <title>ValidateRecord doesn't maintain column order?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183218#M75568</link>
      <description>&lt;P&gt;Hi, everyone,&lt;/P&gt;&lt;P&gt;I am currently working with the ValidateRecord processor in NiFi to test its capabilities and see whether it's a fit for a task I have. One step I want in my flow is to validate the format of a CSV file before placing it in HDFS for further processing (using Hive and other methods). The ValidateRecord processor does exactly what I need it to do, except...&lt;/P&gt;&lt;P&gt;What I'm expecting the processor to do is read the CSV data, verify the format &amp;amp; filter out any bad rows, and create a FlowFile with columns in the same order. However, after the ValidateRecord block runs, the columns are rearranged, for reasons that I cannot quite understand. I can restore the original column order with the ConvertRecord processor, but is that extra step really necessary, or is there something I'm missing when using the ValidateRecord block?&lt;/P&gt;&lt;P&gt;Potentially relevant information:&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;Running NiFi version 1.5.0&lt;/LI&gt;&lt;LI&gt;Using an AvroSchemaRegistry with CSVReader and CSVRecordSetWriter in the ValidateRecord block&lt;/LI&gt;&lt;LI&gt;Would prefer to keep the data as raw text as much as possible, as further processes do additional formatting &amp;amp; cleanup&lt;/LI&gt;&lt;LI&gt;Columns seem to be in an arbitrary order when the file leaves the ValidateRecord block (i.e., the column names aren't sorted alphabetically, by the length of the field, etc.)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 09 Mar 2018 05:41:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183218#M75568</guid>
      <dc:creator>jessica_david</dc:creator>
      <dc:date>2018-03-09T05:41:58Z</dc:date>
    </item>
    <item>
      <title>Re: ValidateRecord doesn't maintain column order?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183219#M75569</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/19224/jessicadavid.html" nodeid="19224"&gt;@Jessica David&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;I confirm this is a bug. I created a JIRA for it: &lt;A href="https://issues.apache.org/jira/browse/NIFI-4955" target="_blank"&gt;https://issues.apache.org/jira/browse/NIFI-4955&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I will submit a fix in a minute. Thanks for reporting the issue.&lt;/P&gt;&lt;P&gt;I assume you're using the header as the schema access strategy in the CSV Reader. If you're able to use a different strategy (schema name, or schema text), it should solve the problem, although you will then need to define the schema explicitly.&lt;/P&gt;</description>
      <pubDate>Sat, 10 Mar 2018 01:31:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183219#M75569</guid>
      <dc:creator>pvillard</dc:creator>
      <dc:date>2018-03-10T01:31:59Z</dc:date>
    </item>
    <item>
      <title>Re: ValidateRecord doesn't maintain column order?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183220#M75570</link>
      <description>&lt;P&gt;Thanks, Pierre! Glad to help, and I'm especially grateful for the quick turnaround time.&lt;/P&gt;&lt;P&gt;I will switch to the explicit schema definition for now, while we still have only a few files (and therefore only a few schemas) to validate. Ideally, in the future we'll be able to use the header-based schema again when we have a large number of schemas coming through.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
      <pubDate>Mon, 12 Mar 2018 20:40:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/ValidateRecord-doesn-t-maintain-column-order/m-p/183220#M75570</guid>
      <dc:creator>jessica_david</dc:creator>
      <dc:date>2018-03-12T20:40:09Z</dc:date>
    </item>
  </channel>
</rss>

