Support Questions

celestial1122 · ‎02-22-2022

Hi, I'm using NiFi 1.11.4 and I came across the following issue:

I'm dynamically reading a CSV file that contains 2 headers with the same name (I can't use an Avro schema registry, as I don't know the header names in advance). When I try to read the file with CSVReader, I get a "duplicate header" error. How can I handle this situation?

Note: I need to maintain the same header name.

araujo · ‎02-22-2022

@celestial1122 ,

If you could provide a sample of the input data you have and the expected output, it would help.

But in general, once NiFi converts the CSV data into flowfile records, the record cannot have duplicated column names. If you want to keep both columns' values, you must rename one of them to a different name. You could do that, for example, using a ReplaceText processor.
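To illustrate the renaming idea outside of NiFi, here is a minimal Python sketch that rewrites only the header line so that later occurrences of a repeated name get a numeric suffix. The function name `dedupe_header` and the `_2`, `_3` suffix convention are assumptions made for this example, not something NiFi or ReplaceText provides. The sketch is naive: it does not check whether a generated name such as `col_b_2` already exists in the header.

```python
from collections import Counter


def dedupe_header(header_line, sep=","):
    """Rename repeated column names by suffixing later occurrences with _2, _3, ...

    Hypothetical helper for illustration only; the first occurrence of each
    name is left untouched.
    """
    seen = Counter()
    renamed = []
    for name in header_line.split(sep):
        seen[name] += 1
        # Keep the first occurrence as-is, suffix every later one with its count.
        renamed.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return sep.join(renamed)


if __name__ == "__main__":
    print(dedupe_header("col_a,col_b,col_b"))
```

If the original header names must be preserved in the final output, the renamed columns would have to be mapped back at the end of the flow.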

If you could provide examples, we could probably help with more ideas.

Regards,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

araujo · ‎02-22-2022

Actually, an easier way to ignore the column name duplication and still process the columns correctly would be to use a schema to describe your data.

For example, say you have the following CSV:

col_a,col_b,col_b
1,2,3
4,5,6

You can configure your CSVReader with the following:

And the data will be processed correctly:
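As a minimal sketch of the idea (plain Python, not NiFi itself): when a reader skips the header line and assigns field names from a schema by position, duplicate names in the header no longer matter. The function `read_with_schema` and the field names below, including `col_c` for the third column, are assumptions chosen for illustration. In CSVReader this would presumably correspond to supplying the schema through the Schema Text property and enabling both Treat First Line as Header and Ignore CSV Header Column Names; treat those settings as an assumption as well, since the exact configuration is not shown here.

```python
import csv
import io


def read_with_schema(csv_text, field_names):
    """Parse CSV text, ignoring its header line and naming columns by position.

    Hypothetical helper: field_names plays the role of the schema.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # discard the header line; its (duplicated) names are not used
    return [dict(zip(field_names, row)) for row in reader]


# Sample data from the post above.
SAMPLE = "col_a,col_b,col_b\n1,2,3\n4,5,6\n"

records = read_with_schema(SAMPLE, ["col_a", "col_b", "col_c"])

if __name__ == "__main__":
    for record in records:
        print(record)
```

Note that this approach requires knowing at least the number and order of columns in advance, even if the header names themselves are not trusted.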

HTH,

André


Cloudera Community

Nifi - CSV with duplicate headers