I want to generate an Avro schema from a CSV file using the Kite SDK in Java

New Contributor

I want to generate an Avro schema from a CSV file using the Kite SDK in Java.

Is there any way to do this?

3 REPLIES

Super Mentor

@Raj123 

 

NiFi offers many record-based processors that support various record readers and writers.
These record readers can infer an Avro schema from the incoming records, and the record writer can be configured to write the inferred schema to an attribute on the outgoing FlowFile.

There is no dedicated infer-schema processor for CSV source data. That would require a custom processor (perhaps one that uses the existing CSVReader controller service).

Typically you would use a record-based processor to manipulate, split, or validate your records, so I am not sure of the value or use case for only wanting to infer the Avro schema.

That being said, you can get the inferred schema by, for example, using the "ConvertRecord" processor with a "CSVReader" (configured to infer the schema) and a "CSVRecordSetWriter" (configured to "Set 'avro.schema' Attribute"). The written FlowFile will have the same content as the source FlowFile, but it will carry an additional "avro.schema" attribute containing the inferred Avro schema.

ConvertRecord:
[screenshot: Screen Shot 2021-01-08 at 4.09.24 PM.png]

CSVReader:
[screenshot: Screen Shot 2021-01-08 at 4.10.15 PM.png]

CSVRecordSetWriter:
[screenshot: Screen Shot 2021-01-08 at 4.10.32 PM.png]

 

Hope this helps,

Matt

New Contributor

Thanks for your answer.

Can we do this in Java using the NiFi libraries?

If yes, is there any sample code?

Super Mentor

@Raj123 

I am not a Java developer, but NiFi is written in Java and its source code is open source.
You would need to look at the code for the CSVReader to see how it handles Avro schema inference.
Sorry that I cannot be of more help with this specific query.
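
For anyone looking for a starting point in Java, the general idea behind inferring a schema from CSV can be sketched with the standard library alone. This is not NiFi's CSVReader code and not the Kite SDK API. The class name CsvSchemaSketch and the methods typeOf, merge, and inferSchema are hypothetical names made up for this illustration. The sketch assumes a header row, comma delimiters, no quoted fields, and header names that are already valid Avro names. It picks the narrowest type that fits every non-empty value in a column and marks a column as nullable when any value is empty.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not NiFi or Kite SDK code.
class CsvSchemaSketch {

    // Numeric types ordered from narrowest to widest.
    private static final List<String> NUMERIC = Arrays.asList("int", "long", "double");

    // Guess the Avro primitive type of a single non-empty cell.
    static String typeOf(String value) {
        if (value.equalsIgnoreCase("true") || value.equalsIgnoreCase("false")) return "boolean";
        try { Integer.parseInt(value); return "int"; } catch (NumberFormatException e) { /* try wider */ }
        try { Long.parseLong(value); return "long"; } catch (NumberFormatException e) { /* try wider */ }
        try { Double.parseDouble(value); return "double"; } catch (NumberFormatException e) { /* not numeric */ }
        return "string";
    }

    // Combine the type seen so far with a new one: widen numerics, otherwise fall back to string.
    static String merge(String current, String next) {
        if (current == null || current.equals(next)) return next;
        int a = NUMERIC.indexOf(current);
        int b = NUMERIC.indexOf(next);
        if (a >= 0 && b >= 0) return NUMERIC.get(Math.max(a, b));
        return "string";
    }

    // Build an Avro record schema (as a JSON string) from CSV text with a header row.
    // Assumes no quoted fields and does no JSON escaping or name sanitizing.
    static String inferSchema(String recordName, String csv) {
        String[] lines = csv.split("\\R");
        String[] header = lines[0].split(",", -1);
        String[] types = new String[header.length];
        boolean[] nullable = new boolean[header.length];

        for (int row = 1; row < lines.length; row++) {
            if (lines[row].isEmpty()) continue;
            String[] cells = lines[row].split(",", -1);
            for (int col = 0; col < header.length; col++) {
                String cell = col < cells.length ? cells[col].trim() : "";
                if (cell.isEmpty()) { nullable[col] = true; continue; }
                types[col] = merge(types[col], typeOf(cell));
            }
        }

        StringBuilder fields = new StringBuilder();
        for (int col = 0; col < header.length; col++) {
            // A column with no values at all defaults to string.
            String type = types[col] == null ? "string" : types[col];
            String avroType = nullable[col] ? "[\"null\",\"" + type + "\"]" : "\"" + type + "\"";
            if (col > 0) fields.append(',');
            fields.append("{\"name\":\"").append(header[col].trim())
                  .append("\",\"type\":").append(avroType).append('}');
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName + "\",\"fields\":[" + fields + "]}";
    }

    public static void main(String[] args) {
        String csv = "id,name,score,active\n1,Ann,3.5,true\n2,Bob,,false\n";
        System.out.println(inferSchema("Person", csv));
    }
}
```

If the Apache Avro library is on the classpath, the resulting string can be handed to new Schema.Parser().parse(...) to obtain a Schema object. A production version would need a real CSV parser to handle quoting and embedded delimiters, and should sanitize header names before using them as Avro field names.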