<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to remove the header when using NiFi SplitText processor in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227812#M63860</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11929/alvinuw.html" nodeid="11929"&gt;@Alvin Jin&lt;/A&gt; To answer your question about which processors to use: it depends on what you want to do with the whole CSV file. Your question only mentions splitting and ignoring the header; the CSVReader takes care of that. The record-aware processors in NiFi 1.3.0 include:&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-10-nar/1.3.0/org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord_0_10/index.html"&gt;ConsumeKafkaRecord_0_10&lt;/A&gt;: Gets messages from a Kafka topic and bundles them into a single flow file instead of one per message&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.ConvertRecord/index.html"&gt;ConvertRecord&lt;/A&gt;: Converts records from one data format to another (e.g., Avro to JSON)&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.LookupRecord/index.html"&gt;LookupRecord&lt;/A&gt;: Uses fields from a record to look up a value, which can be added back to the record&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.PartitionRecord/index.html"&gt;PartitionRecord&lt;/A&gt;: Groups "like" records (based on user-provided criteria) into individual flow files&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-10-nar/1.3.0/org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_0_10/index.html"&gt;PublishKafkaRecord_0_10&lt;/A&gt;: Posts messages to a Kafka topic&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.PutDatabaseRecord/index.html"&gt;PutDatabaseRecord&lt;/A&gt;: Executes a specified operation (e.g., INSERT, UPDATE, DELETE) on a database for each record in a flow file&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-elasticsearch-nar/1.3.0/org.apache.nifi.processors.elasticsearch.PutElasticsearchHttpRecord/index.html"&gt;PutElasticsearchHttpRecord&lt;/A&gt;: Executes a specified operation (e.g., "index") on an Elasticsearch cluster for each record in a flow file&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.QueryRecord/index.html"&gt;QueryRecord&lt;/A&gt;: Executes SQL queries on fields from the records. This can be used to filter, aggregate, etc.&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.SplitRecord/index.html"&gt;SplitRecord&lt;/A&gt;: Splits records into smaller flow files. Usually only needed when downstream processors are not record-aware&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.UpdateRecord/index.html"&gt;UpdateRecord&lt;/A&gt;: Updates field(s) in each record of a flow file&lt;/P&gt;&lt;P&gt;Also, I wanted to mention that if all your CSV columns are strings, you can set "Schema Access Strategy" to "Use String Fields From Header", and then you don't need a schema or schema registry. Otherwise, if you want to provide a schema, you're not required to use a schema registry: you can paste your schema into the Schema Text property and set "Schema Access Strategy" to "Use Schema Text Property".&lt;/P&gt;</description>
    <pubDate>Sat, 01 Jul 2017 00:11:09 GMT</pubDate>
    <dc:creator>mburgess</dc:creator>
    <dc:date>2017-07-01T00:11:09Z</dc:date>
    <item>
      <title>How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227809#M63857</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have a CSV file whose first line is a header.&lt;/P&gt;&lt;P&gt;When I use the SplitText processor, each of the small split files contains that header as its first line.&lt;/P&gt;&lt;P&gt;Is there an easy way to generate the split files without the header?&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jun 2017 03:18:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227809#M63857</guid>
      <dc:creator>alvinuw</dc:creator>
      <dc:date>2017-06-29T03:18:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227810#M63858</link>
      <description>&lt;P&gt;You could set the Header Line Count to 0, then send the flowfiles to a &lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.RouteOnAttribute/index.html"&gt;RouteOnAttribute&lt;/A&gt; processor where you can "skip" the first line by routing on the following Expression Language statement:&lt;/P&gt;&lt;PRE&gt;${fragment.index:gt(0)}&lt;/PRE&gt;&lt;P&gt;The first line will be routed to "unmatched" and the rest to "matched" or the user-defined property name (depending on the value of the Routing Strategy property). Note that this requires the Line Split Count property be set to 1 in SplitText.&lt;/P&gt;&lt;P&gt;Alternatively, if you are using (or can upgrade to) NiFi 1.3.0, you can use a record-aware processor with a &lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.3.0/org.apache.nifi.csv.CSVReader/index.html"&gt;CSVReader&lt;/A&gt;. This reader can be configured to (among other things) skip the header line. The record-aware processors also offer better performance when working with flow files that contain many "records" (such as a CSV file where each "record" is a row).&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jun 2017 03:32:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227810#M63858</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2017-06-29T03:32:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227811#M63859</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/641/mburgess.html" nodeid="641"&gt;@Matt Burgess&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thank you for your response.&lt;/P&gt;&lt;P&gt;The first solution works for me.&lt;/P&gt;&lt;P&gt;For the second solution, may I ask which processors I should use? CSVReader is a service, and it also requires a schema and a schema registry.&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jun 2017 21:00:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227811#M63859</guid>
      <dc:creator>alvinuw</dc:creator>
      <dc:date>2017-06-30T21:00:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227812#M63860</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11929/alvinuw.html" nodeid="11929"&gt;@Alvin Jin&lt;/A&gt; To answer your question about which processors to use: it depends on what you want to do with the whole CSV file. Your question only mentions splitting and ignoring the header; the CSVReader takes care of that. The record-aware processors in NiFi 1.3.0 include:&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-10-nar/1.3.0/org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord_0_10/index.html"&gt;ConsumeKafkaRecord_0_10&lt;/A&gt;: Gets messages from a Kafka topic and bundles them into a single flow file instead of one per message&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.ConvertRecord/index.html"&gt;ConvertRecord&lt;/A&gt;: Converts records from one data format to another (e.g., Avro to JSON)&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.LookupRecord/index.html"&gt;LookupRecord&lt;/A&gt;: Uses fields from a record to look up a value, which can be added back to the record&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.PartitionRecord/index.html"&gt;PartitionRecord&lt;/A&gt;: Groups "like" records (based on user-provided criteria) into individual flow files&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-10-nar/1.3.0/org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_0_10/index.html"&gt;PublishKafkaRecord_0_10&lt;/A&gt;: Posts messages to a Kafka topic&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.PutDatabaseRecord/index.html"&gt;PutDatabaseRecord&lt;/A&gt;: Executes a specified operation (e.g., INSERT, UPDATE, DELETE) on a database for each record in a flow file&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-elasticsearch-nar/1.3.0/org.apache.nifi.processors.elasticsearch.PutElasticsearchHttpRecord/index.html"&gt;PutElasticsearchHttpRecord&lt;/A&gt;: Executes a specified operation (e.g., "index") on an Elasticsearch cluster for each record in a flow file&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.QueryRecord/index.html"&gt;QueryRecord&lt;/A&gt;: Executes SQL queries on fields from the records. This can be used to filter, aggregate, etc.&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.SplitRecord/index.html"&gt;SplitRecord&lt;/A&gt;: Splits records into smaller flow files. Usually only needed when downstream processors are not record-aware&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.UpdateRecord/index.html"&gt;UpdateRecord&lt;/A&gt;: Updates field(s) in each record of a flow file&lt;/P&gt;&lt;P&gt;Also, I wanted to mention that if all your CSV columns are strings, you can set "Schema Access Strategy" to "Use String Fields From Header", and then you don't need a schema or schema registry. Otherwise, if you want to provide a schema, you're not required to use a schema registry: you can paste your schema into the Schema Text property and set "Schema Access Strategy" to "Use Schema Text Property".&lt;/P&gt;</description>
      <pubDate>Sat, 01 Jul 2017 00:11:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227812#M63860</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2017-07-01T00:11:09Z</dc:date>
    </item>
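As a rough illustration of what "Use String Fields From Header" amounts to, the Python sketch below (not NiFi code; the sample data and the function name read_records are made up for this example) lets the header line name the fields and leaves every value as a string, so no header row shows up among the records.

```python
import csv
import io

# Hypothetical sample data standing in for the CSV flow file content.
SAMPLE_CSV = "id,name,amount\n1,alice,10\n2,bob,25\n"


def read_records(text):
    """Return one dict per data row; the header only supplies the field names."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


records = read_records(SAMPLE_CSV)
for record in records:
    print(record)
```

Every value stays a string here, which matches the case where no schema is supplied; typed columns would need an explicit schema instead.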
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227813#M63861</link>
      <description>&lt;P&gt;In my case, the CSV columns are not all strings; some are long types.&lt;/P&gt;&lt;P&gt;Yes, I can provide the schema text without using a Schema Registry.&lt;/P&gt;&lt;P&gt;For your first solution, I think the index starts from 1, so the expression should be ${fragment.index:gt(1)}&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Sat, 01 Jul 2017 01:14:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227813#M63861</guid>
      <dc:creator>alvinuw</dc:creator>
      <dc:date>2017-07-01T01:14:29Z</dc:date>
    </item>
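When the columns are not all strings, the text pasted into the Schema Text property is an Avro schema written as JSON. The sketch below builds one in Python and prints it; the record name and the field names are hypothetical, and only the use of a long type comes from the post above.

```python
import json

# Hypothetical record and field names; replace them with the CSV's real columns.
schema = {
    "type": "record",
    "name": "CsvRow",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "amount", "type": "long"},
    ],
}

# Serialize the schema so the output can be pasted into the Schema Text property.
schema_text = json.dumps(schema, indent=2)
print(schema_text)
```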
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227814#M63862</link>
      <description>&lt;P&gt;SplitText for some reason starts the index at 1, while the other Split processors start at 0. Sorry, I had forgotten that difference. Good catch!&lt;/P&gt;</description>
      <pubDate>Sat, 01 Jul 2017 03:04:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/227814#M63862</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2017-07-01T03:04:59Z</dc:date>
    </item>
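The first solution, with the 1-based index confirmed above, can be mimicked in a few lines of Python. This is only a simulation of the flow logic, not NiFi code: the function names split_text and route_on_attribute and the sample input are invented for the sketch.

```python
def split_text(content, first_index=1):
    """Mimic SplitText with Line Split Count = 1 and Header Line Count = 0.

    Each line becomes one (fragment_index, line) pair. first_index=1 follows
    the numbering reported in this thread for SplitText.
    """
    lines = content.splitlines()
    return [(first_index + i, line) for i, line in enumerate(lines)]


def route_on_attribute(fragments, threshold=1):
    """Mimic routing on ${fragment.index:gt(1)}: returns (matched, unmatched)."""
    matched = [line for index, line in fragments if index > threshold]
    unmatched = [line for index, line in fragments if not index > threshold]
    return matched, unmatched


# Hypothetical input: a header line followed by two data rows.
matched, unmatched = route_on_attribute(split_text("id,name\n1,alice\n2,bob\n"))
print(matched)
print(unmatched)
```

With first_index=0, as described for the other Split processors, the threshold would be 0 instead.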
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/308239#M63863</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38301"&gt;@mburgess&lt;/a&gt;&amp;nbsp;&amp;amp; &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/55910"&gt;@alvinuw&lt;/a&gt;&lt;/P&gt;&lt;P&gt;I want to load a txt file (not CSV) into Postgres, and I want to remove the header from the txt file.&lt;/P&gt;&lt;P&gt;I have used these processors: ListFile, FetchFile, SplitText, RouteOnAttribute and ReplaceText (for the regex). I tried your proposal, but it does not work for me.&lt;/P&gt;&lt;P&gt;Could you please tell me what I am doing wrong?&lt;/P&gt;&lt;P&gt;The screenshot is attached.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="RemoveHeader.PNG" style="width: 746px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29864i4351AEE6E89A1BB7/image-size/large?v=v2&amp;amp;px=999" role="button" title="RemoveHeader.PNG" alt="RemoveHeader.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 23 Dec 2020 01:46:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/308239#M63863</guid>
      <dc:creator>Lamtoro</dc:creator>
      <dc:date>2020-12-23T01:46:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove the header when using NiFi SplitText processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/325721#M63864</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38301"&gt;@mburgess&lt;/a&gt;&amp;nbsp;I used your first suggestion and it worked like a charm, with just one exception: the header row was index 1. I'm not sure if it was just me, my data, or some property/attribute I set wrong. Just thought you should know. After modifying the user-defined attribute value to ${fragment.index:gt(1)}, it worked. And, in case you ask, the header row is the first row in the CSV file, which doesn't make sense unless the processor logic changed to 1-based indexing instead of 0-based indexing.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Also, thanks for all of your blog posts. I use your suggestions a lot.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Sep 2021 16:06:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-remove-the-header-when-using-NiFi-SplitText-processor/m-p/325721#M63864</guid>
      <dc:creator>alencosoft</dc:creator>
      <dc:date>2021-09-29T16:06:16Z</dc:date>
    </item>
  </channel>
</rss>

