Avro schema change using the ConvertRecord processor in NiFi

Explorer

I am using ExecuteSQL to extract data from Oracle. When the records are retrieved, columns of type NUMBER are converted to BYTES. To keep those numbers as long/double, I am using the ConvertRecord processor.

In ConvertRecord, I chose an AvroReader that uses the embedded schema, and an AvroRecordSetWriter with an external schema to change the bytes datatypes to long/int. But once the processor runs, the schema appears to be lost: I am not able to convert the output to ORC in order to create a Hive table. Is this some kind of bug?
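For context, here is an illustrative sketch outside NiFi. It assumes the BYTES fields are Avro decimal logical-type values, which depends on how ExecuteSQL is configured and is not confirmed in this thread. In that encoding the bytes hold the unscaled number as a big-endian two's-complement integer, and the scale comes from the field's schema. The function name is hypothetical.

```python
from decimal import Decimal


def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode an Avro decimal logical-type value.

    raw   -- the bytes stored in the record (big-endian two's complement)
    scale -- the scale declared on the field in the Avro schema
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# 0x3039 is 12345 unscaled; with scale 2 that is 123.45.
print(decode_avro_decimal(b"\x30\x39", 2))
```

A NUMBER column with scale 0 decodes to a whole number, which is the case where mapping the field to long in the writer schema is lossless.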

1 ACCEPTED SOLUTION

Master Guru
@Ishan Kumar

In the AvroRecordSetWriter controller service, set the Schema Write Strategy property to Embed Avro Schema.

That way the new schema is written into the Avro data file itself, and the ConvertAvroToORC processor can read it without issues. The java.io.IOException: Not a data file error appears only when a processor cannot find a schema in the Avro data file.
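To check whether a flowfile's content really carries an embedded schema, the header can be inspected outside NiFi. This is a minimal diagnostic sketch, not something NiFi provides. It relies on the Avro object container file layout: the magic bytes Obj\x01, followed by a metadata map whose avro.schema entry holds the schema JSON. The function names are hypothetical.

```python
import io
import json

AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def _read_long(stream) -> int:
    """Read one Avro long (zigzag-encoded variable-length integer)."""
    shift = 0
    result = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of data while reading a long")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def _read_bytes(stream) -> bytes:
    """Read a length-prefixed Avro bytes/string value."""
    length = _read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data while reading bytes")
    return data


def read_embedded_schema(data: bytes):
    """Return the embedded schema (parsed JSON), or None if there is none."""
    stream = io.BytesIO(data)
    if stream.read(4) != AVRO_MAGIC:
        return None  # not a container file, so no embedded schema
    metadata = {}
    while True:
        count = _read_long(stream)
        if count == 0:
            break  # end of the metadata map
        if count < 0:
            _read_long(stream)  # negative count is followed by a block size
            count = -count
        for _ in range(count):
            key = _read_bytes(stream).decode("utf-8")
            metadata[key] = _read_bytes(stream)
    schema = metadata.get("avro.schema")
    return json.loads(schema) if schema is not None else None
```

Passing the raw bytes of a flowfile's content (for example, downloaded from the provenance view) returns the schema when it is embedded and None when the content does not start with the container header, which is the situation the error message describes.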


4 REPLIES 4

Explorer

Thanks, Shu, for the response. In my case I need to provide my own schema. I created the schema in the schema registry and am trying to apply it. My intention is to convert the datatype (from binary to long).

Master Guru
@Ishan Kumar

Schema Write Strategy defines how the schema travels with the output: for example, by adding a schema.name attribute, or by embedding the Avro schema in the data file (in your case, the new schema you defined in AvroSchemaRegistry), and so on.

Add a schema.name attribute to the flowfile whose value matches the name of the schema in the AvroSchemaRegistry. The ConvertRecord processor then writes its output flowfile using the schema you defined in the registry (i.e. with the long type).

AvroRecordSetWriter controller service configs:

[Screenshot: 80430-avrosetwriter.png]

With these configs, the output is a new Avro data file with the AvroSchemaRegistry schema embedded in it.
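As a rough illustration of what such a registry entry could contain, the sketch below builds an Avro record schema that declares numeric columns as long and double. The record name and field names are hypothetical, since the thread does not show the actual table; the printed JSON is the kind of text that would be pasted as the schema's value in AvroSchemaRegistry.

```python
import json

# Hypothetical record and column names; replace them with the real ones
# from the Oracle table. Each field is a union with "null" so that NULL
# column values remain valid.
target_schema = {
    "type": "record",
    "name": "oracle_extract",
    "fields": [
        {"name": "ID", "type": ["null", "long"]},
        {"name": "AMOUNT", "type": ["null", "double"]},
        {"name": "NAME", "type": ["null", "string"]},
    ],
}

# The JSON text to register under the name that schema.name refers to.
schema_text = json.dumps(target_schema, indent=2)
print(schema_text)
```

The field names in the writer schema need to match the field names in the incoming records so that ConvertRecord can map each value to its new type.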

Explorer

Thanks, Shu.