Created 08-25-2023 08:25 AM
Hi all. I can't figure out why the MergeRecord processor converts fractional values to integers. The source data is read from a JSON file.
Created 08-29-2023 01:59 AM
Problem solved by converting "stake" to a decimal type in the JoltTransformJSON processor: ${stake:toDecimal()}
Created 08-25-2023 08:33 AM
Example : "stake" : 0.5 ---> "stake" : 0 or 1
Created 08-25-2023 09:25 AM
@Dim Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @cotopaul @MattWho who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres,
Created 08-28-2023 12:46 AM
@Dim, I do not think that MergeRecord is the one doing this, but rather the schema you have defined in your RecordReader and your RecordWriter.
For example, I am working with streaming data in Parquet and Avro format, using MergeRecord three times during the flow, and each fractional value remains fractional because I set the RecordWriter to use a schema that accepts fractional data.
I suggest you take a look at the schema you have defined within your Controller Services and start debugging from there 🙂
Besides that, your problem might start somewhere else. You need to check your flow from start to end and see whether you are working with the correct scale and precision and whether the data types are correct.
Created on 08-28-2023 01:34 AM - edited 08-28-2023 01:35 AM
Thanks for the hint on the schema. I am using JsonTreeReader and JsonRecordSetWriter as standard. What schema options can be used to make MergeRecord work?
Created 08-28-2023 01:57 AM
Well, first of all, what does the data look like before entering MergeRecord?
Secondly, how did you configure both the Reader and the Writer? You pasted the configuration for MergeRecord, which has nothing to do with how the data gets transformed.
Created 08-28-2023 02:42 AM
Before
{
"UUID_FlowFile" : "0e543142-a8f1-48b4-84b6-b65a85e4708a",
"update_date" : "2023-08-28 09:26:52",
"employee_code" : "0000000121",
"employee_id" : "9bf2cf03-19d4-11ea-8533-00155d0b352f",
"orgunit_id" : null,
"stake" : 0.5,
}
After
{
"UUID_FlowFile" : "0e543142-a8f1-48b4-84b6-b65a85e4708a",
"update_date" : "2023-08-28 09:26:52",
"employee_code" : "0000000121",
"employee_id" : "9bf2cf03-19d4-11ea-8533-00155d0b352f",
"orgunit_id" : null,
"stake" : 0,
}
[Screenshot: Reader configuration]
[Screenshot: Writer configuration]
Created 08-28-2023 06:05 AM
Well, I am not near a PC to test right now, but my initial thought is that the problem is related to how your raw data comes into your flow.
As I can see, you have both an INT value and a FLOAT value, not a consistent data type:
"stake" : 0,
"stake" : 0.5,
Now, you set your Schema Access Strategy to Inherit Record Schema. This is correct in most cases, but not in yours, because your data types are not consistent.
If two files enter MergeRecord and one has the value 0 and the other the value 0.5, you will have two different schemas, meaning that the files cannot be merged correctly. If the first file comes in as an INT, your second FlowFile (and all the others coming right after it) will automatically be converted to an INT value, no matter what their value is.
To avoid this, change your RecordReader and your RecordWriter from Inherit Record Schema to "Use Schema Text Property" and define your schema manually in the new field (which will appear once you switch).
Make sure that in your schema the field is defined with a data type that accepts fractional data and not just an INT value.
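As a rough sketch of what such a schema text could look like, here is an Avro schema built only from the fields visible in the sample record posted earlier in this thread. The record name is made up, the string types are guesses based on the sample values, and the type of "orgunit_id" is unknown because the sample only shows null, so treat all of these as assumptions and adjust them to your real data. The important part is that "stake" is declared as "double":

```json
{
  "type": "record",
  "name": "EmployeeStake",
  "doc": "Hypothetical record name. Field types other than stake are assumptions based on the sample record.",
  "fields": [
    { "name": "UUID_FlowFile", "type": ["null", "string"], "default": null },
    { "name": "update_date", "type": ["null", "string"], "default": null },
    { "name": "employee_code", "type": ["null", "string"], "default": null },
    { "name": "employee_id", "type": ["null", "string"], "default": null },
    { "name": "orgunit_id", "type": ["null", "string"], "default": null, "doc": "Assumed to be a string; the sample only shows null." },
    { "name": "stake", "type": ["null", "double"], "default": null, "doc": "Declared as double so that 0 and 0.5 are both read as fractional values." }
  ]
}
```

If your real records contain more fields than the sample shows, they would need to be added here as well. Using the same schema text in both the Reader and the Writer should keep the "stake" type consistent across all FlowFiles being merged.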
Created 08-28-2023 02:52 AM
I notice that in some cases the merge works correctly, but I have not been able to determine what that depends on.