NiFi JsonRecordSetWriter 1.17.0 Corrupting Data
Labels: Apache NiFi
Created ‎01-29-2023 07:25 AM
Hello guys,
I'm struggling with JsonRecordSetWriter 1.17.0. We have valid nested JSON data, but whenever it passes through JsonRecordSetWriter, the elements of nested arrays are written out as strings like the following:
"orderItemsCustomizes" : [ "MapRecord[{quantity=1, itemCurrency=AED, totalPrice=0, orderItemId=2767, itemCustomizeGroup=MapRecord[{itemModifierGroupAttributes=[Ljava.lang.Object;@4658e00e, customizeGroupName=Burger Selection}], itemCustomize=MapRecord[{itemCustomizeGroupId=2308, itemId=18622, itemName=Chicken Burger, isAvailabel=true, updated_at=2023-01-27 15:59:31, itemCurrency=AED, itemCost=0.00, created_at=2023-01-27 15:59:31, itemModifierAttributes=[Ljava.lang.Object;@ee87ab1, id=7539, itemImage=https://d2cvcbugmdflrn.cloudfront.net/e777bb72-a445-4162-9844-f9bda5e7a3d0.jpg}], itemId=18622, itemCustomizeGroupId=2308, createdAt=2023-01-27 16:48:02, itemCustomizeId=7539, price=0.00, ID=1153, updatedAt=2023-01-27 16:48:02}]", "MapRecord[{quantity=1, itemCurrency=AED, totalPrice=2, orderItemId=2767, itemCustomizeGroup=MapRecord[{itemModifierGroupAttributes=[Ljava.lang.Object;@d33701d, customizeGroupName=Fries Selection}], itemCustomize=MapRecord[{itemCustomizeGroupId=2309, itemId=18622, itemName=French Fries, isAvailabel=true, updated_at=2023-01-27 15:59:31, itemCurrency=AED, itemCost=2.00, created_at=2023-01-27 15:59:31, itemModifierAttributes=[Ljava.lang.Object;@f5ca635, id=7542, itemImage=null}], itemId=18622, itemCustomizeGroupId=2309, createdAt=2023-01-27 16:48:02, itemCustomizeId=7542, price=2.00, ID=1154, updatedAt=2023-01-27 16:48:02}]" ],
I've seen a number of posts reporting the same thing. Any idea why this happens?
Created ‎01-29-2023 08:40 AM
OK, I figured it out. For anyone having a similar issue:
This happens when a JSON array is read by any of the built-in readers with the schema setting set to "infer schema", and the data type of a field changes between elements of the array.
For instance, consider the following array:
{
  "items": [
    {
      "name": "John Doe",
      "type": []
    },
    {
      "name": "Jane Doe",
      "type": [
        {
          "some_key": [
            {
              "some_nested_key": "value"
            }
          ]
        }
      ]
    }
  ]
}
Notice that in the items array, "type" is an empty array in the first element, while in the second element it holds a nested object that itself contains an array.
As far as I can tell, when using 'infer schema', NiFi scans the first element to construct the schema on the fly. That means it maps the "type" key to an array without storing any schema for the nested records that appear in the second element. The writer then emits the string form of each nested record, and the result is type: [MapRecord...].
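To check your own data for this kind of drift before it reaches NiFi, something like the following rough Python sketch may help. It is not NiFi code and does not reproduce NiFi's inference; the helper names shape and conflicting_fields are made up for illustration, and it only compares the top-level fields of each element.

```python
import json


def shape(value):
    """Return a coarse structural description of a parsed JSON value."""
    if isinstance(value, dict):
        return {key: shape(item) for key, item in sorted(value.items())}
    if isinstance(value, list):
        # An empty list carries no information about its element type.
        return [shape(value[0])] if value else []
    return type(value).__name__


def conflicting_fields(records):
    """List top-level fields whose shape differs from its first occurrence."""
    seen = {}
    conflicts = set()
    for record in records:
        for key, value in record.items():
            current = shape(value)
            if key in seen and seen[key] != current:
                conflicts.add(key)
            seen.setdefault(key, current)
    return sorted(conflicts)


SAMPLE = json.loads("""
{
  "items": [
    {"name": "John Doe", "type": []},
    {"name": "Jane Doe", "type": [{"some_key": [{"some_nested_key": "value"}]}]}
  ]
}
""")

print(conflicting_fields(SAMPLE["items"]))  # ['type']
```

Any field it reports is a candidate for the problem described above and is worth pinning down in an explicit schema.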
The solution is to define the schema with the entire structure up front. Simply remove the first element, where type is an empty array, and use http://www.dataedu.ca/avro to generate an Avro schema from what remains. Once that is done, use AvroSchemaRegistry to store the schema and configure your record reader to use that schema instead of inferring one.
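If you would rather write the schema by hand than use an online generator, here is a minimal Python sketch that builds an Avro schema for the sample array above and prints it as JSON. The record names (root, item, type_entry, some_key_entry) and the helper functions are my own placeholders, not anything NiFi requires, and every field is assumed to be non-null; adjust both to your real data.

```python
import json


def record(name, fields):
    """Build an Avro record schema from a {field_name: field_type} mapping."""
    return {
        "type": "record",
        "name": name,
        "fields": [{"name": key, "type": value} for key, value in fields.items()],
    }


def array_of(items):
    """Build an Avro array schema with the given element type."""
    return {"type": "array", "items": items}


# Hypothetical record names; the structure mirrors the sample JSON.
SCHEMA = record("root", {
    "items": array_of(record("item", {
        "name": "string",
        "type": array_of(record("type_entry", {
            "some_key": array_of(record("some_key_entry", {
                "some_nested_key": "string",
            })),
        })),
    })),
})

SCHEMA_TEXT = json.dumps(SCHEMA, indent=2)
print(SCHEMA_TEXT)
```

The printed JSON is what you would paste into AvroSchemaRegistry as the schema text. Because "type" is declared as an array of records, an empty array in the first element is still valid, and the nested records in later elements keep their structure.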
Created ‎12-13-2023 03:08 AM
Thanks a lot!
Created ‎12-13-2023 03:11 AM
By the way, this problem has been fixed in newer versions of JsonRecordSetWriter, for example 1.23.2.
