Created 01-08-2021 04:24 AM
I want to generate an Avro schema from a CSV file using the Kite SDK in Java.
Is there any way to do this?
Created 01-08-2021 01:12 PM
NiFi offers many "record" based processors that support various record readers and writers.
Those record readers can infer an Avro schema from the incoming records, and the record writer can be configured to write the inferred schema to an attribute on the outgoing FlowFile.
There is no specific infer-schema processor for CSV source data. That would require a custom processor (perhaps one that uses the existing CSVReader controller service).
Typically you would use a record-based processor to manipulate, split, or validate your records, so I am not sure of the value or use case for only wanting to infer the Avro schema.
That being said, you can get the inferred schema by, for example, using the "ConvertRecord" processor with a "CSVReader" (configured to infer the schema) and a "CSVRecordSetWriter" (configured to "Set 'avro.schema' Attribute"). The written FlowFile will be the same as the source FlowFile, but it will have an additional "avro.schema" attribute containing the inferred Avro schema.
[Configuration screenshots for ConvertRecord, CSVReader, and CSVRecordSetWriter are not available here.]
Hope this helps,
Matt
Created 01-10-2021 11:04 PM
Thanks for your answer.
Can we do this in Java using the NiFi libraries?
If yes, is there any sample code for it?
Created 01-12-2021 05:34 AM
@Raj123
I am not a Java developer, but NiFi is written in Java and its source code is open source.
You would need to look at the code for the CSVReader to see how it handles Avro schema inference.
Sorry that I cannot be of more help with this specific query.
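
For anyone who wants a starting point in plain Java, the general idea behind schema inference can be sketched without any NiFi, Kite, or Avro library. This is only an illustration, not the logic used by NiFi's CSVReader. The class name CsvSchemaSketch and its methods are hypothetical, and it assumes a header row, comma-separated values with no quoted fields, and header names that are already valid Avro names. Each column gets the widest type seen (long, then double, then string), and a column with an empty cell becomes a union with "null".

```java
// Hypothetical helper for illustration; not part of NiFi, Kite, or Avro.
class CsvSchemaSketch {

    // Ordered from narrowest to widest; a column keeps the widest type seen.
    enum FieldType {
        LONG("long"), DOUBLE("double"), STRING("string");

        final String avroName;

        FieldType(String avroName) {
            this.avroName = avroName;
        }
    }

    // Classifies a single non-empty cell value.
    static FieldType typeOf(String value) {
        try {
            Long.parseLong(value);
            return FieldType.LONG;
        } catch (NumberFormatException ignored) {
            // not an integer, try a wider type
        }
        try {
            Double.parseDouble(value);
            return FieldType.DOUBLE;
        } catch (NumberFormatException ignored) {
            // not a number either
        }
        return FieldType.STRING;
    }

    // Returns an Avro record schema as a JSON string.
    // Assumes: first line is the header, no quoted fields, valid Avro field names.
    static String inferSchema(String recordName, String csvText) {
        String[] lines = csvText.split("\\R");
        String[] header = lines[0].split(",", -1);
        FieldType[] types = new FieldType[header.length];
        boolean[] nullable = new boolean[header.length];

        for (int row = 1; row < lines.length; row++) {
            if (lines[row].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[row].split(",", -1);
            for (int col = 0; col < header.length; col++) {
                String value = col < cells.length ? cells[col].trim() : "";
                if (value.isEmpty()) {
                    nullable[col] = true; // missing value: make the field optional
                    continue;
                }
                FieldType seen = typeOf(value);
                if (types[col] == null || seen.ordinal() > types[col].ordinal()) {
                    types[col] = seen; // widen the column type
                }
            }
        }

        StringBuilder json = new StringBuilder();
        json.append("{\"type\":\"record\",\"name\":\"").append(recordName)
            .append("\",\"fields\":[");
        for (int col = 0; col < header.length; col++) {
            if (col > 0) {
                json.append(',');
            }
            // Columns with no observed values fall back to string.
            FieldType type = types[col] == null ? FieldType.STRING : types[col];
            String avroType = "\"" + type.avroName + "\"";
            if (nullable[col]) {
                avroType = "[\"null\"," + avroType + "]";
            }
            json.append("{\"name\":\"").append(header[col].trim())
                .append("\",\"type\":").append(avroType).append('}');
        }
        return json.append("]}").toString();
    }

    public static void main(String[] args) {
        String csv = "id,name,score\n1,Ann,3.5\n2,Bob,4\n";
        System.out.println(inferSchema("User", csv));
    }
}
```

The resulting JSON string could then be handed to an Avro schema parser for validation. A real implementation would also need proper CSV parsing (quoting, escapes) and more types such as boolean, date, and timestamp.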