Created 01-22-2025 12:47 PM
Hi,
I have JSON in the format below:
[
{
"order_id": 1,
"category": 123
},
{
"order_id": 2,
"category": 123
},
{
"order_id": 3,
"category": 456
},
{
"order_id": 4,
"category": 321
}
]
I want to create a separate FlowFile for each category:
flowfile1 - orders of category 123:
[
{
"order_id": 1,
"category": 123
},
{
"order_id": 2,
"category": 123
}
]
flowfile2 - orders of category 456:
[
{
"order_id": 3,
"category": 456
}
]
flowfile3 - orders of category 321:
[
{
"order_id": 4,
"category": 321
}
]
I don't want to use SplitJson, because this will be applied to millions of records and I don't want millions of FlowFiles.
I tried EvaluateJsonPath and RouteOnAttribute:
all_category : $.[*].category - this creates one attribute with all category values
RouteOnAttribute
category_123 : ${all_category:contains(123)}
category_321 : ${all_category:contains(321)}
category_456 : ${all_category:contains(456)}
But each of the 3 routed FlowFiles contains all the order details.
Please guide me here, I am new to NiFi.
Created 01-22-2025 12:52 PM
@AbhiTryingAgain Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @MattWho @SAMSAL who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 01-24-2025 07:01 AM
@AbhiTryingAgain
Welcome to the community and to Apache NiFi.
Before building dataflows, it is important to understand the basics of a NiFi FlowFile. This will help you navigate the available processor components and understand, at a high level, what each one is expected to do. NiFi uses FlowFiles so that it can remain data agnostic, which allows it to handle content of any type. Performing actions against the content of a FlowFile, however, requires processors that understand the data format of that content.
A NiFi FlowFile is what is transferred from one NiFi Processor on the canvas to the next.
A FlowFile consists of two parts: its content (the actual data) and its attributes (key/value metadata about that content).
With that in mind, NiFi processors like "RouteOnAttribute" never look at the content of a FlowFile; they only look at its attributes and route the FlowFile to the matching dynamic downstream relationship. Because your all_category attribute holds every category value, all three of your routes evaluated to 'true', so the FlowFile was cloned and routed to all three downstream relationships.
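To make that concrete, here is a small Python sketch (not NiFi code) of what the three RouteOnAttribute expressions effectively test. The exact string form of the all_category attribute is an assumption; what matters is that it is a single string containing every category value.

```python
# Assumed attribute value written by EvaluateJsonPath for $.[*].category.
# The exact formatting is hypothetical; it is one string holding all values.
all_category = "[123,123,456,321]"

# Each route mirrors ${all_category:contains(...)} from the question.
routes = {
    "category_123": "123",
    "category_321": "321",
    "category_456": "456",
}

# A route matches when the attribute string contains the value.
# The FlowFile content is never inspected.
matched = [name for name, needle in routes.items() if needle in all_category]

print(matched)
```

Every route matches, so the same unmodified content goes to every relationship.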
What you need is a NiFi processor that will evaluate the multiple json records in your FlowFile's content and output multiple FlowFiles based on a unique category value.
For this, I think the PartitionRecord processor is what you could use. It avoids splitting your JSON array into individual records and then merging the various splits back into single FlowFiles based on category. You can use a JsonTreeReader and a JsonRecordSetWriter with it. Based on your example, the configuration would look like this:
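As a rough guide, and as an assumption rather than the exact setup from this reply: PartitionRecord is typically given a Record Reader (JsonTreeReader), a Record Writer (JsonRecordSetWriter), and one user-defined property whose value is a RecordPath, for example a property named category with the value /category. Records with the same value for that path are written to the same output FlowFile. The Python sketch below only illustrates that grouping behaviour; the function name partition_by_field is hypothetical and is not part of NiFi.

```python
import json
from collections import defaultdict


def partition_by_field(records, field):
    """Group records by the value of `field`, keeping input order per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record)
    return dict(groups)


# The sample input from the question.
orders = json.loads("""
[
  {"order_id": 1, "category": 123},
  {"order_id": 2, "category": 123},
  {"order_id": 3, "category": 456},
  {"order_id": 4, "category": 321}
]
""")

# One group per distinct category, analogous to one output FlowFile each.
partitions = partition_by_field(orders, "category")

for category, group in partitions.items():
    print(category, json.dumps(group))
```

The number of outputs equals the number of distinct categories, not the number of records, which is the point of avoiding a per-record split.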
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt
Created 02-03-2025 11:43 AM
@AbhiTryingAgain Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres,