Created 07-08-2018 05:43 AM
I am using ExecuteSQL to extract data from Oracle. When the records are retrieved, columns of type NUMBER are converted to bytes. To keep these values as long/double, I am using the ConvertRecord processor.
In the ConvertRecord processor, I chose an AvroReader that uses the embedded schema, and an AvroRecordSetWriter with an external schema to change the bytes fields to long/int. But after the processor runs, the schema appears to be lost: I am not able to convert the output to ORC in order to create a Hive table. Is this a bug?
Created 07-08-2018 09:52 AM
In the AvroRecordSetWriter controller service, set the Schema Write Strategy property to Embed Avro Schema.
That way the new schema is embedded in the Avro data file, and the ConvertAvroToORC processor will have no trouble reading it. The error java.io.IOException: Not a data file occurs only when a processor cannot find any schema in the Avro data file.
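As a rough way to check whether a flowfile's content really is an Avro data file with an embedded schema, one can look at its first bytes. Avro object container files start with the four magic bytes `Obj` followed by the byte 1, and the header metadata carries the schema under the key `avro.schema`. The sketch below uses only the Python standard library; the function names are hypothetical, and the metadata check is a crude byte search rather than a real Avro parse.

```python
# First four bytes of every Avro object container file: 'O', 'b', 'j', 1.
AVRO_MAGIC = b"Obj\x01"


def has_avro_container_header(data: bytes) -> bool:
    """Return True if the payload starts with the Avro container magic bytes."""
    return data[:4] == AVRO_MAGIC


def looks_like_embedded_schema(data: bytes) -> bool:
    """Crude check: magic bytes present and the 'avro.schema' metadata key appears.

    This is a byte search, not a parse of the header, so treat it as a hint only.
    """
    return has_avro_container_header(data) and b"avro.schema" in data
```

Content written without an embedded schema (for example, bare Avro-encoded records) fails the first check, which is consistent with the "Not a data file" error described above.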
Created 07-08-2018 07:42 PM
Thanks, Shu, for the response. Here I need to provide my own schema. I created the schema in the schema registry and am trying to apply it. My intention is to convert the data type (from binary to long).
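For background on what the binary values may contain: this sketch assumes the bytes fields carry the Avro decimal logical type, which the thread does not confirm. Under that logical type, the bytes hold the unscaled value as a big-endian two's-complement integer, and the scale comes from the schema. The function name is hypothetical.

```python
from decimal import Decimal


def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode an Avro 'decimal' logical-type value stored as bytes.

    raw   -- big-endian two's-complement unscaled integer
    scale -- number of digits after the decimal point (taken from the schema)
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by 'scale' digits.
    return Decimal(unscaled).scaleb(-scale)
```

With a scale of 0 the decoded value is a whole number, which is the case where a long target type loses nothing.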
Created on 07-08-2018 08:36 PM - edited 08-18-2019 12:51 AM
Schema Write Strategy defines how the schema is written with the output, i.e. whether to add a schema.name attribute, embed the Avro schema in the data file (in your case, the new schema defined in AvroSchemaRegistry), and so on.
Add a schema.name attribute to the flowfile that matches the schema's name in the AvroSchemaRegistry. The ConvertRecord processor then writes the schema you defined in the registry (i.e. with the long type) into the output flowfile.
AvroRecordSetWriter controller service configs:
With these configs, you will get a new Avro data file with the AvroSchemaRegistry schema embedded in it.
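The settings referred to above are not preserved in this thread. Going by the discussion, they are probably along the lines of the dictionary below: Schema Write Strategy set to Embed Avro Schema, with the schema looked up by name from the registry. Treat the other property values as assumptions. The sketch also builds an example schema text that could be pasted into AvroSchemaRegistry; the record name, the schema name and the field names are hypothetical placeholders for the real Oracle columns.

```python
import json

# Assumed AvroRecordSetWriter settings, reconstructed from the discussion only.
writer_settings = {
    "Schema Write Strategy": "Embed Avro Schema",             # stated in the thread
    "Schema Access Strategy": "Use 'Schema Name' Property",   # assumption
    "Schema Registry": "AvroSchemaRegistry",                  # named in the thread
    "Schema Name": "${schema.name}",                          # assumption
}

# Hypothetical target schema: NUMBER columns declared as long/double, not bytes.
target_schema = {
    "type": "record",
    "name": "oracle_row",  # hypothetical record name
    "fields": [
        {"name": "ID", "type": ["null", "long"], "default": None},
        {"name": "AMOUNT", "type": ["null", "double"], "default": None},
    ],
}

# JSON text to register under the name that schema.name will point to.
schema_text = json.dumps(target_schema, indent=2)
print(schema_text)
```

The schema.name attribute on the flowfile must then match the name under which this text is registered.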
Created 07-11-2018 10:53 AM
Thanks, Shu.