NIFI | nifi Extract content from json array into separate attribute

Hi, 

I have JSON in the format below:

[
{
"order_id": 1,
"category": 123
},
{
"order_id": 2,
"category": 123 
},
{
"order_id": 3,
"category": 456
},
{
"order_id": 4,
"category": 321
}
]

 

I want to create a separate flowfile for each category:

flowfile1 - orders of category 123:

[
{
"order_id": 1,
"category": 123
},
{
"order_id": 2,
"category": 123 
}

]

flowfile2 - orders of category 456

[

{
"order_id": 3,
"category": 456
}

]

flowfile3 - orders of category 321:

[

{
"order_id": 4,
"category": 321
}
]

 

I don't want to use SplitJson, because this will be applied to millions of records and I don't want millions of flowfiles.
I tried EvaluateJsonPath and RouteOnAttribute:

all_category : $.[*].category - this creates one attribute with all category values

RouteOnAttribute 
category_123 : ${all_category:contains(123)}
category_321 : ${all_category:contains(321)}
category_456 : ${all_category:contains(456)}

But each of the 3 routed flowfiles contains all the order details.

Please guide me here; I am new to NiFi.

 

3 REPLIES

Community Manager

@AbhiTryingAgain Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our NiFi experts @MattWho and @SAMSAL, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Senior Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:

Master Mentor

@AbhiTryingAgain 

Welcome to the community and to Apache NiFi.

Before building dataflows, it is important to understand the basics of a NiFi FlowFile. This will help you navigate the available processor components and understand, at a high level, what each of them does. NiFi uses FlowFiles so that it can remain data agnostic, which allows it to handle content of any type. Performing actions against the content of a FlowFile, however, requires processors that understand the data format of that content.

A NiFi FlowFile is what is transferred from one NiFi processor on the canvas to the next.
A FlowFile consists of two parts:

  1. FlowFile Content - This is the actual binary data/content. NiFi persists this content in content claims within the NiFi content_repository.
  2. FlowFile Attributes/Metadata - These are attributes and metadata about the content and the FlowFile. At the most basic level, every FlowFile has attributes such as timestamps and a filename. Various NiFi processors add further attributes to a FlowFile.
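
As a rough illustration of the two parts described above, a FlowFile can be pictured as opaque bytes plus a map of string attributes. The `FlowFileSketch` class and the `orders.json` filename below are hypothetical names chosen for this sketch; they are not NiFi classes or values from the original post.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FlowFileSketch:
    """Hypothetical stand-in for a NiFi FlowFile: opaque content plus attributes."""
    content: bytes                                   # raw bytes, not interpreted here
    attributes: dict = field(default_factory=dict)   # string key/value metadata


orders = [{"order_id": 1, "category": 123}, {"order_id": 3, "category": 456}]
flowfile = FlowFileSketch(
    content=json.dumps(orders).encode("utf-8"),
    attributes={"filename": "orders.json"},          # assumed example value
)

# Adding an attribute changes only the metadata; the content bytes are untouched.
before = flowfile.content
flowfile.attributes["all_category"] = "[123,456]"
content_unchanged = flowfile.content == before
```

The point of the sketch is only that attribute-level operations leave the content exactly as it was.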

With that in mind, NiFi processors like "RouteOnAttribute" never look at the content of a FlowFile; they look only at its attributes and route the FlowFile to the matching dynamic downstream relationship. Because your all_category attribute holds every category value, all three of your routes evaluated to 'true', so the FlowFile was cloned and routed, with its full content, to all three downstream relationships.
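
The behavior can be mimicked outside NiFi with a small simulation. This is only a sketch: the substring test stands in for the `contains` function of the Expression Language, and the exact string that EvaluateJsonPath writes into `all_category` is an assumption.

```python
import json

orders = [
    {"order_id": 1, "category": 123},
    {"order_id": 2, "category": 123},
    {"order_id": 3, "category": 456},
    {"order_id": 4, "category": 321},
]
content = json.dumps(orders)

# One attribute holding every category value (assumed formatting).
categories = [order["category"] for order in orders]
attributes = {"all_category": json.dumps(categories, separators=(",", ":"))}

# Route name -> value checked by the route's expression.
routes = {"category_123": "123", "category_321": "321", "category_456": "456"}

routed = {}
for relationship, needle in routes.items():
    # Attribute-only check: the content is never inspected or filtered.
    if needle in attributes["all_category"]:
        routed[relationship] = content  # the whole, unfiltered content is cloned
```

All three checks succeed, so `routed` ends up with three identical copies of the original array, which matches what was observed.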

What you need is a NiFi processor that will evaluate the multiple JSON records in your FlowFile's content and output one FlowFile per unique category value.

For this, I think the PartitionRecord processor is what you could use. It avoids splitting your JSON array into individual records and then merging the various splits back into single FlowFiles based on category. You can use a JsonTreeReader and a JsonRecordSetWriter with it. Based on your example, the configurations would look like this:

[Three configuration screenshots (MattWho_0-1737730552733.png, MattWho_1-1737730588906.png, MattWho_3-1737730783060.png); images not reproduced here]
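
Conceptually, partitioning by category amounts to the grouping below. This is a plain-Python sketch of the idea, not how PartitionRecord is implemented; the function name `partition_by_field` is hypothetical. It assumes a setup in which PartitionRecord has a dynamic property named `category` whose value is the RecordPath `/category`, so that each output FlowFile also carries a `category` attribute.

```python
import json
from collections import defaultdict


def partition_by_field(content: str, field_name: str):
    """Group JSON records by one field; return a list of (attributes, content)."""
    groups = defaultdict(list)
    for record in json.loads(content):
        groups[record[field_name]].append(record)
    # One output per distinct value, with the value exposed as an attribute.
    return [
        ({field_name: str(value)}, json.dumps(records))
        for value, records in groups.items()
    ]


incoming = json.dumps([
    {"order_id": 1, "category": 123},
    {"order_id": 2, "category": 123},
    {"order_id": 3, "category": 456},
    {"order_id": 4, "category": 321},
])

outputs = partition_by_field(incoming, "category")
for attrs, body in outputs:
    print(attrs, body)
```

For the sample input this yields three outputs (categories 123, 456, and 321), each containing only its own orders. The number of outputs depends on the number of distinct categories, not on the number of records.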

 

Please help our community thrive. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

 

Community Manager

@AbhiTryingAgain Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.


Regards,

Diana Torres,
Senior Community Moderator

