Created 12-08-2017 10:01 PM
Hi All,
I'm new to NiFi and trying to solve an issue I'm facing with Avro-to-JSON and JSON-to-Avro conversion using the ConvertAvroToJSON and ConvertJSONToAvro processors. Everything works fine when I use string, int, or boolean datatypes; however, when I throw a long decimal number into the schema, it refuses to work.
Specifically:
1. ConvertAvroToJSON - I'm getting an ArrayIndexOutOfBoundsException when I supply an Avro schema.
2. ConvertJSONToAvro - I'm getting "Failed to convert 1/1 records from JSON to Avro" (I'm running a single record to test).
The flows I'm using are straightforward:
1. ExecuteSQL (from an MSSQL database) > ConvertAvroToJSON (I get the above error when I specify the Avro schema) > PutFile
2. ExecuteSQL (from an MSSQL database) > ConvertAvroToJSON (no schema in the Avro schema property) > ConvertJSONToAvro (here I specify the Avro schema) > PutFile
Following is the Avro schema I'm using for the decimal logical type:
{ "type":"record", "name":"test", "namespace":"any.data", "fields": [ { "name":"field1", "type":[ "null","string" ] },{ "name":"field2", "type":[ "null", { "type":"bytes", "logicalType":"decimal", "precision":32, "scale":10 } ] }, { "name":"field3", "type":[ "null","boolean" ] }] }
I've tried different versions of the schema, but it's still not working. The one row I'm pulling has a field2 value of 37.8531000000.
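If it helps, my understanding of what the processor has to produce for field2: an Avro decimal is written as the two's-complement bytes of the unscaled integer, and the scale comes from the schema rather than the data, so at scale 10 the value 37.8531000000 is stored as the unscaled integer 378531000000. A quick standalone check in plain Java (no NiFi involved):

```java
import java.math.BigDecimal;

public class DecimalEncodingDemo {
    public static void main(String[] args) {
        // Avro's decimal logical type stores only the two's-complement
        // bytes of the unscaled integer; the scale lives in the schema.
        BigDecimal value = new BigDecimal("37.8531000000");

        System.out.println(value.unscaledValue()); // 378531000000
        System.out.println(value.scale());         // 10

        // These are the bytes that end up in the Avro "bytes" field:
        byte[] encoded = value.unscaledValue().toByteArray();
        System.out.println(encoded.length + " bytes"); // 5 bytes
    }
}
```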
PS - I'm using all of this because I want to use DetectDuplicate. The flow looks like: ExecuteSQL > ConvertAvroToJSON > EvaluateJsonPath > DetectDuplicate > ConvertJSONToAvro > ... {more processors to do validations} ... > PutFile. With DetectDuplicate I'm basically limiting ExecuteSQL to a single output file by setting the Age Off Duration to 5 hours (we're only doing batch processing for now), instead of letting ExecuteSQL write multiple files; a rough sketch of the configuration is below.
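For reference, this is roughly how I have the duplicate-detection pieces configured (property names as I see them on the stock processors, so please double-check against your NiFi version; $.field1 is just an example key from my schema):

```
EvaluateJsonPath
  Destination               : flowfile-attribute
  dedupe.key                : $.field1        (dynamic property extracting the dedupe key)

DetectDuplicate
  Cache Entry Identifier    : ${dedupe.key}
  Age Off Duration          : 5 hours
  Distributed Cache Service : a DistributedMapCacheClientService
```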
Really appreciate your help!!
Created 12-09-2017 02:21 AM
What version of NiFi are you using? Check NIFI-3000 for a history of what's been done and what hasn't. Depending on your version, you will likely want to switch to record-aware processors such as ConvertRecord, as they support logical types as of NIFI-2624, where some other processors may not. You may also be able to leverage PartitionRecord to help with grouping the same values, or QueryRecord (with LIMIT 1, perhaps) to help with duplicate detection/elimination; a sketch of the latter is below.
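To illustrate the QueryRecord route: you add a dynamic property to the processor, its value is a SQL query over the incoming records (exposed as a table named FLOWFILE), and the property name becomes an outgoing relationship. A minimal sketch to adapt rather than copy (the property name "first_record" is just an example):

```sql
-- Value of a dynamic property (e.g. "first_record") on QueryRecord.
-- Keeps only the first record from each incoming FlowFile.
SELECT * FROM FLOWFILE LIMIT 1
```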
Created 12-12-2017 07:25 PM
Thanks so much for the reply and the details! I removed ConvertAvroToJSON and ConvertJSONToAvro, replaced them with ConvertRecord, and it worked fine!
However, after conversion to JSON, the decimal numbers are missing their trailing zeroes. For example, if the number in the database is 37.8531000000, it gets stripped down to 37.8531. I know it shouldn't matter, but we're trying to pull the data exactly as it is, without any changes, so I just wanted to know if there is any way to retain them?
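My guess is this happens because a JSON number carries no scale, so 37.8531000000 and 37.8531 are the same value once serialized, and the writer has nothing to preserve. Outside NiFi the zeroes can be restored by re-applying the schema's scale; a minimal Java sketch (not NiFi code):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RestoreScaleDemo {
    public static void main(String[] args) {
        // A JSON number has no scale, so the writer emits 37.8531.
        BigDecimal fromJson = new BigDecimal("37.8531");

        // Re-apply the scale declared in the Avro schema (10 here);
        // RoundingMode.UNNECESSARY asserts that no digits are dropped.
        BigDecimal restored = fromJson.setScale(10, RoundingMode.UNNECESSARY);

        System.out.println(restored); // 37.8531000000
    }
}
```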
Thanks again,
Appreciate your help!
Created 08-03-2022 07:19 AM
Try specifying the schema with precision, e.g.:
{
  "type": "record",
  "name": "schema",
  "fields": [
    {
      "name": "_COL_0",
      "type": { "type": "fixed", "name": "_COL_0", "size": 16, "logicalType": "decimal", "precision": 38, "scale": 0 }
    }
  ]
}