Created 11-28-2024 01:58 AM
My NiFi flow fails when encountering a CSV with a column containing a double quote within a string, such as:
"Protection from Abuse Order on file against dad Raul Martinez Lopez." NO CONTACT WITH DAD. 12/16/2014 kb.
The error is occurring at the Record Reader stage. Has anyone else successfully handled CSV data with embedded double quotes?
From csv:
My record reader config
Created 11-28-2024 06:07 AM
Hi,
First, if the data you posted contains real personal information, I would recommend removing it and using dummy data instead. It's a violation of the community guidelines to post personal information (see point 7 of the community guidelines).
Regarding the error: you are getting it because of the property setting Quote Character = " in the CSVReader service. This setting means that when a column value contains one of the reserved CSV characters, such as a comma (,) as the column separator or a new line (\n) as the record separator, and you don't or can't use the escape character (\), you can surround the whole column value with double quotes at both ends. Once the value is quoted this way, no further characters may follow the closing quote within the same column. For more info, please refer to: https://csv-loader.com/csv-guide/why-quotation-marks-are-used-in-csv
Since the line you have listed has characters after the closing ", you are getting the illegal character error.
To Resolve:
You have two options:
1- Use ReplaceText to replace every double quote " character with \" so that the quote is escaped. However, this might not be very efficient if you have a large CSV file.
2- A more efficient option is to replace the Quote Character in the CSVReader with something other than ". However, you have to make sure that your data will never contain the new character in any of the CSV values. Possible options: $, %, ^
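To see the two options side by side outside NiFi, here is a minimal sketch using Python's built-in csv module in strict mode as a stand-in for the CSVReader. This is an assumption for illustration only: NiFi uses a different parser, so the exact error message and edge-case behavior may differ. The sample row is dummy data, and the `parse` helper is a hypothetical name.

```python
import csv
import io

# Dummy row: the second column opens with a quoted phrase, then continues.
line = '1,"Order on file." NO CONTACT. 12/16/2014 kb.,open\n'


def parse(text, **fmt):
    """Parse CSV text in strict mode and return the list of rows."""
    return list(csv.reader(io.StringIO(text), strict=True, **fmt))


# Default quote character ("): text after the closing quote is rejected.
try:
    parse(line)
    failed = False
except csv.Error:
    failed = True

# Option 1: escape every double quote with a backslash before parsing,
# and tell the parser that backslash is the escape character.
escaped = line.replace('"', '\\"')
rows_escaped = parse(escaped, escapechar='\\', doublequote=False)

# Option 2: choose a quote character that never occurs in the data,
# so the double quotes are treated as ordinary text.
rows_requoted = parse(line, quotechar='^')

print(failed)
print(rows_escaped)
print(rows_requoted)
```

In this sketch both options produce the same three columns, with the double quotes kept as part of the second value. Option 2 needs no extra pass over the data, which matches the efficiency point above.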
If this helps, please accept the solution.
Thanks
Created 11-28-2024 08:27 PM
I tried to delete the data you mentioned, but I don't know how to edit the topic. Thank you very much for your support.