Member since 06-21-2017
2 Posts
1 Kudos Received
0 Solutions
03-12-2018 01:40 PM
Thanks, Pierre! Glad to help, and I'm especially grateful for the quick turnaround time. I will switch to the explicit schema definition for now, while we still have only a few files (and therefore only a few schemas) to validate. Ideally we'll be able to use this in the future, when we have a large number of schemas coming through. Cheers!
03-08-2018 09:41 PM
1 Kudo
Hi, everyone,

I am currently testing the ValidateRecord processor in NiFi to see whether it fits a task I have. One step in my flow needs to validate the format of a CSV file before placing it in HDFS for further processing (using Hive and other methods). The ValidateRecord processor does exactly what I need, except for one thing.

What I expect the processor to do is read the CSV data, verify the format and filter out any bad rows, and create a FlowFile with the columns in the same order. However, after ValidateRecord runs, the columns are rearranged, for reasons I don't understand. I can restore the original column order with the ConvertRecord processor, but is that extra step necessary, or am I missing something in how I'm using ValidateRecord?

Potentially relevant information:
- Running NiFi version 1.5.0
- Using an AvroSchemaRegistry with CSVReader and CSVRecordSetWriter in the ValidateRecord processor
- Would prefer to keep the data as raw text as much as possible, since further processes do additional formatting and cleanup
- Columns seem to be in an arbitrary order when the file leaves ValidateRecord (i.e., the column names aren't sorted alphabetically, by the length of the field, etc.)

Thanks!
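
To make the expected behavior concrete, here is a minimal Python sketch of what I'd like the step to do. This is not NiFi code and says nothing about how ValidateRecord works internally; the schema, the `validate_csv` and `write_csv` helpers, and the sample data are all made up for illustration. The point is only that the output column order follows the input header, not the order in which the schema happens to list its fields.

```python
import csv
import io


def is_float(value):
    """Return True if the raw text parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical schema: column name -> check on the raw text value.
# Deliberately listed in a different order than the CSV header.
SCHEMA = {
    "amount": is_float,
    "id": lambda v: v.isdigit(),
    "name": lambda v: v != "",
}


def validate_csv(text, schema):
    """Split rows into valid and invalid, keeping each row as raw text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], [], []
    header, body = rows[0], rows[1:]
    valid, invalid = [], []
    for row in body:
        # A row is valid if it has one value per header column and every
        # value passes the check registered for its column (if any).
        ok = len(row) == len(header) and all(
            schema[name](value)
            for name, value in zip(header, row)
            if name in schema
        )
        (valid if ok else invalid).append(row)
    return header, valid, invalid


def write_csv(header, rows):
    """Write rows back out using the original header order."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


sample = "id,name,amount\n1,alice,3.5\nx,bob,2\n3,carol,oops\n4,dave,10\n"
header, valid, invalid = validate_csv(sample, SCHEMA)
cleaned = write_csv(header, valid)
```

Here `cleaned` keeps the `id,name,amount` order of the input file even though the schema lists `amount` first, which is the behavior I was expecting from the validation step.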
Labels:
- Apache NiFi