<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFI - Converting CSV to Avro, header contains spaces in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212245#M78871</link>
    <description>&lt;P&gt;One option is to define the schema ahead of time in one of the schema registries, set your CSVReader's Schema Access Strategy to "Schema Name" so that it uses the schema from the registry, and tell the reader to ignore the first line of the CSV. The downside is that you have to define the schema yourself rather than just using the column headers.&lt;/P&gt;&lt;P&gt;Besides that, the next easiest option would probably be to use ExecuteScript with a simple script that reads the first line, converts the spaces in the column names to underscores, and then writes the converted header back out along with all the other lines unmodified.&lt;/P&gt;&lt;P&gt;There might be a way to do it with ReplaceText, but I'm not exactly sure how to modify only the first line.&lt;/P&gt;</description>
    <pubDate>Wed, 30 May 2018 02:49:57 GMT</pubDate>
    <dc:creator>bbende</dc:creator>
    <dc:date>2018-05-30T02:49:57Z</dc:date>
    <item>
      <title>NiFI - Converting CSV to Avro, header contains spaces</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212244#M78870</link>
      <description>&lt;P&gt;I've had pretty good success converting CSV to JSON and Avro using the ConvertRecord processor.&lt;/P&gt;&lt;P&gt;However, I'm having issues converting a CSV file with spaces in the header (column names).&lt;/P&gt;&lt;P&gt;Example CSV:&lt;/P&gt;&lt;P&gt;"Date of Birth"&lt;/P&gt;&lt;P&gt;01-23-1981&lt;/P&gt;&lt;P&gt;Is there a way to replace the spaces ' ' with '_' on just the header row? Is there another way to handle column/field names with spaces when using the ConvertRecord processor to convert to Avro?&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 02:02:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212244#M78870</guid>
      <dc:creator>hyuen</dc:creator>
      <dc:date>2018-05-30T02:02:17Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI - Converting CSV to Avro, header contains spaces</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212245#M78871</link>
      <description>&lt;P&gt;One option is to define the schema ahead of time in one of the schema registries, set your CSVReader's Schema Access Strategy to "Schema Name" so that it uses the schema from the registry, and tell the reader to ignore the first line of the CSV. The downside is that you have to define the schema yourself rather than just using the column headers.&lt;/P&gt;&lt;P&gt;Besides that, the next easiest option would probably be to use ExecuteScript with a simple script that reads the first line, converts the spaces in the column names to underscores, and then writes the converted header back out along with all the other lines unmodified.&lt;/P&gt;&lt;P&gt;There might be a way to do it with ReplaceText, but I'm not exactly sure how to modify only the first line.&lt;/P&gt;</description>
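The header-rewriting step described in this reply can be sketched in plain Python. This is only a minimal illustration of the string logic, assuming the whole CSV fits in memory as one string; the function name underscore_header is hypothetical, and the NiFi-specific wiring that ExecuteScript needs (reading from and writing to the flow file) is deliberately left out.

```python
def underscore_header(csv_text):
    """Replace spaces with underscores in the first line only (hypothetical helper)."""
    # partition splits at the first newline; everything after it stays untouched.
    header, newline, rest = csv_text.partition("\n")
    # Only the header line is rewritten, so spaces inside data values survive.
    return header.replace(" ", "_") + newline + rest


if __name__ == "__main__":
    sample = '"Date of Birth"\n01-23-1981\n'
    print(underscore_header(sample), end="")
```

Note that this only handles spaces. A header containing other characters that Avro does not allow in field names would need additional cleanup.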
      <pubDate>Wed, 30 May 2018 02:49:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212245#M78871</guid>
      <dc:creator>bbende</dc:creator>
      <dc:date>2018-05-30T02:49:57Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI - Converting CSV to Avro, header contains spaces</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212246#M78872</link>
      <description>&lt;P&gt;Adding to Bryan's answer: if you have the schema available to put in the registry, you can set the registry's Validate Field Names property to false, meaning you can have field names defined in the Avro schema that do not conform to the stricter Avro naming rules.&lt;/P&gt;&lt;P&gt;We should consider adding this property to readers that generate their own schema, such as CSVReader...&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 10:06:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212246#M78872</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2018-05-30T10:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI - Converting CSV to Avro, header contains spaces</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212247#M78873</link>
      <description>&lt;P&gt;If you use an "invalid" schema, will it be able to write to Avro? I can see how that could work for transforming from CSV to JSON, but I don't think it will work for Avro because of its field-name rules.&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 21:30:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212247#M78873</guid>
      <dc:creator>hyuen</dc:creator>
      <dc:date>2018-05-30T21:30:44Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI - Converting CSV to Avro, header contains spaces</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212248#M78874</link>
      <description>&lt;P&gt;Yeah, that's true. I misread the first sentence of your question and was thinking of conversion to JSON only, my bad.&lt;/P&gt;</description>
      <pubDate>Thu, 31 May 2018 04:58:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Converting-CSV-to-Avro-header-contains-spaces/m-p/212248#M78874</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2018-05-31T04:58:41Z</dc:date>
    </item>
  </channel>
</rss>

