
Split JSON flow file into JSON objects

Rising Star

Hi!

My flow file contains more than one JSON object:

{
Nested JSON object 1
}{
Nested JSON object 2
}{
Nested JSON object N
}

Can you please help me split the JSON flow file into separate JSON objects using NiFi processors?

What should the JsonPath Expression be in the SplitJson processor?

Thanks in advance!

1 ACCEPTED SOLUTION

Master Guru
@Gulshan Agivetova

We cannot use the SplitJson processor on this flowfile content because it is not valid JSON. If you want to use SplitJson, you first need to build a valid JSON array that contains all of the JSON objects (for example with a ReplaceText processor) and then feed that valid JSON to the SplitJson processor.
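As a rough illustration of the "build a valid JSON array first" idea, here is a minimal Python sketch of the kind of text replacement a ReplaceText step would perform. The function name `wrap_concatenated_objects` and the sample data are made up for this example, and the regular expression is an assumption about how one might do it, not a tested ReplaceText configuration.

```python
import json
import re


def wrap_concatenated_objects(text: str) -> str:
    """Turn '{...}{...}{...}' into '[{...},{...},{...}]'."""
    # Insert a comma wherever one object closes and the next one opens.
    # Caveat: this also rewrites a literal "}{" that sits inside a string value.
    joined = re.sub(r"\}\s*\{", "},{", text.strip())
    return "[" + joined + "]"


# Hypothetical flowfile content: three objects with no separators between them.
raw = '{"a": {"b": 1}}{"c": 2}\n{"d": 3}'
array_text = wrap_concatenated_objects(raw)
objects = json.loads(array_text)  # now parses as one JSON array
print(objects)
```

Once the content is a real array, a JsonPath-based split has something valid to work on.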

Use the SplitContent processor to split the flowfile content: set Byte Sequence Format to Hexadecimal and Byte Sequence to 7D (the byte value of }).

[Screenshot: SplitContent processor configuration (79399-splitcontent.png)]

Keep the byte sequence (Keep Byte Sequence set to true) and set Byte Sequence Location to Trailing. With this configuration each flowfile will contain a valid JSON object, as long as } does not also occur inside an object (for example in a nested object). You can also set Byte Sequence Format to Text and specify } as the Byte Sequence.
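To make the effect of a trailing, kept byte sequence easier to picture, here is a small Python emulation. It is only a sketch of the splitting idea, not NiFi code; `split_keep_trailing` and the sample content are hypothetical.

```python
def split_keep_trailing(content: bytes, sequence: bytes) -> list[bytes]:
    """Cut content after every occurrence of sequence, keeping the sequence
    at the end of each piece."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    pieces = []
    start = 0
    while True:
        idx = content.find(sequence, start)
        if idx == -1:
            break
        end = idx + len(sequence)
        pieces.append(content[start:end])
        start = end
    # Keep any non-whitespace remainder that follows the last occurrence.
    tail = content[start:]
    if tail.strip():
        pieces.append(tail)
    return pieces


# Flat objects: every piece is a complete JSON object.
flat = split_keep_trailing(b'{"id": 1}{"id": 2}{"id": 3}', bytes.fromhex("7D"))

# Nested object: the inner "}" also triggers a cut, so the pieces are broken.
nested = split_keep_trailing(b'{"a": {"b": 1}}', b"}")
```

The second call shows why a single } only works as a delimiter when the objects are flat.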

In addition, if each JSON object is on a single line, you can use the SplitText processor with Line Split Count set to 1. If you run into OOM issues, follow this approach to avoid them.
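For the one-object-per-line case, the line-by-line split can be pictured with the Python sketch below. It reads one line at a time, so only one object is held in memory at once; that is the general idea behind keeping memory use low, not a reproduction of the approach linked above. The name `iter_json_lines` and the sample input are assumptions for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between objects
            yield json.loads(line)


# Hypothetical content with one JSON object per line.
stream = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
records = list(iter_json_lines(stream))
```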


4 REPLIES 4


Master Guru

As of NiFi 1.7.0 (via NIFI-4456) you can configure the JsonReader to read the above JSON, meaning you can use SplitRecord if you really need to split the objects up. However, depending on your use case, you may be able to use the record-aware processors and/or JoltTransformJSON to handle all the objects in one flow file.
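The idea of a reader that understands back-to-back JSON objects can be sketched in Python with the standard library's `json.JSONDecoder.raw_decode`, which parses one value and reports where it ended. This is only an analogy for what a record reader does, not NiFi's implementation; `iter_concatenated_json` and the sample text are hypothetical.

```python
import json
from typing import Iterator


def iter_concatenated_json(text: str) -> Iterator[object]:
    """Yield each JSON value from a string of concatenated JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Nested objects and a "}{" inside a string are handled, because a real
# parser finds the object boundaries instead of a byte pattern.
sample = '{"a": {"b": "}{"}} {"c": 2}\n{"d": [1, 2]}'
parsed = list(iter_concatenated_json(sample))
```

Because the boundaries come from parsing rather than from a delimiter, this style does not depend on the objects being flat or on one object per line.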

Rising Star

Thank you very much, @Shu and @Matt Burgess!

@Shu, the SplitContent processor really does a great job! I managed to split the nested JSON objects into separate objects using Text as the Byte Sequence Format and }{ as the Byte Sequence. Thank you very much for your advice!

With all the best wishes,

Gulshan


[Attached screenshots: 109016-routestrategy.png, 109008-splitjsonconfig.png, 109000-splitjsonflow.png]

How do I split a complex JSON array into individual JSON objects with the SplitJson processor in NiFi? I don't know how to configure the original, split, and failure relationships. The JSON is below:


{
  "scrollId1": "xyz",
  "data": [
    {
      "id": "app-server-dev-glacier",
      "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
      "name": "app-server-dev-glacier",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "ap-southeast-1",
      "account": "164110977718"
    },
    {
      "id": "abc.company.archive.mboi",
      "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
      "name": "abc.company.archive.mboi",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "us-east-1",
      "account": "852631421774"
    }
  ]
}

I need to split it into:

{
  "id": "app-server-dev-glacier",
  "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
  "name": "app-server-dev-glacier",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "ap-southeast-1",
  "account": "164110977718"
},
{
  "id": "abc.company.archive.mboi",
  "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
  "name": "abc.company.archive.mboi",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "us-east-1",
  "account": "852631421774"
}


Next, I need to insert another field, "time", before "id", which is the first attribute of each individual object.

I used the SplitJson processor with the JsonPath Expression $.data.id.*, but the relationship reports an error. I don't know how to configure the original, split, and failure relationship branches. Does anyone have any advice? @Shu
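For reference, the transformation described above (one output per element of "data", with "time" placed before "id") can be sketched in Python as follows. This is not a NiFi configuration; `split_and_stamp`, the shortened sample document, and the timestamp value are all hypothetical. In SplitJson terms the path would presumably have to point at the array itself (for example $.data) rather than at $.data.id.*, but that is an assumption and is not confirmed anywhere in this thread.

```python
import json


def split_and_stamp(document: str, time_value: str) -> list[str]:
    """Return one JSON string per element of "data", with "time" as first key."""
    parsed = json.loads(document)
    results = []
    for item in parsed["data"]:
        # Python dicts keep insertion order, so "time" is serialized first.
        stamped = {"time": time_value, **item}
        results.append(json.dumps(stamped))
    return results


# Shortened version of the posted document (only two fields per object).
document = """
{
  "scrollId1": "xyz",
  "data": [
    {"id": "app-server-dev-glacier", "region": "ap-southeast-1"},
    {"id": "abc.company.archive.mboi", "region": "us-east-1"}
  ]
}
"""

# The timestamp below is a placeholder value, not taken from the thread.
outputs = split_and_stamp(document, "2019-01-01T00:00:00Z")
for line in outputs:
    print(line)
```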