Split JSON flow file into JSON objects
Labels: Apache NiFi
Created ‎07-05-2018 10:54 AM
Hi!
My flow file contains more than one JSON object:
{ Nested JSON object 1 }{ Nested JSON object 2 }{ Nested JSON object N }
Can you please help me split the JSON flow file into separate JSON objects using NiFi processors?
What should the JsonPath Expression be in the SplitJson processor?
Thanks in advance!
Created on ‎07-05-2018 12:50 PM - edited ‎08-18-2019 01:04 AM
We cannot use the SplitJson processor on this flowfile content because it is not valid JSON. If you want to use SplitJson, you first need to build a valid JSON array containing all of the JSON objects (for example with a ReplaceText processor) and then feed that valid JSON to SplitJson.
Alternatively, use the SplitContent processor to split the flowfile content on a byte sequence: set Byte Sequence Format to Hexadecimal and Byte Sequence to 7D, which is the } character.
Set Keep Byte Sequence to true and Byte Sequence Location to Trailing. With these settings each resulting flowfile contains a valid JSON object, provided the objects themselves contain no nested } characters. You can also set Byte Sequence Format to Text and specify } as the Byte Sequence.
In addition, if each JSON object is on its own line, you can use the SplitText processor with Line Split Count set to 1. If you run into OOM (out of memory) issues, follow this approach to avoid them.
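To make the goal of the split concrete outside NiFi, here is a minimal Python sketch that walks through back-to-back JSON objects and yields each top-level one. It is only an illustration of the intended result, not a NiFi configuration; the function name split_concatenated_json and the sample string are made up for this example. Because it uses a real JSON parser, nested braces and braces inside string values do not confuse it.

```python
import json


def split_concatenated_json(text):
    """Yield each top-level JSON value from back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical sample: three objects with nesting and a "}{" inside a string
sample = '{"a": {"b": 1}}{"c": [1, 2]}{"d": "x}{y"}'
objects = list(split_concatenated_json(sample))
for obj in objects:
    print(json.dumps(obj))
```

Each printed line is one standalone JSON object, which is what every split flowfile should end up containing.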
Created ‎07-05-2018 02:54 PM
As of NiFi 1.7.0 (via NIFI-4456), you can configure the JsonReader to read the JSON above, which means you can use SplitRecord if you really need to split the objects up. However, depending on your use case, you may be able to use the record-aware processors and/or JoltTransformJSON to handle all the objects in one flow file.
Created ‎07-06-2018 05:06 AM
Thank you very much @Shu and @Matt Burgess !!
@Shu, the SplitContent processor really does a great job! I managed to split the nested JSON objects into separate objects using Text as the Byte Sequence Format and }{ as the Byte Sequence. Thank you very much for your advice!!
With all the best wishes,
Gulshan
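For readers who want to see why the }{ boundary works, here is a rough Python sketch of the related idea from the earlier reply: turn the concatenated objects into one valid JSON array by putting a comma at every }{ boundary and wrapping the result in [ ]. The helper name wrap_as_array and the sample data are assumptions for illustration. The sketch also assumes that }{ never appears inside a string value; if it can, a real JSON parser is the safer choice.

```python
import json
import re


def wrap_as_array(text):
    """Turn '{...}{...}' into '[{...},{...}]'.

    Assumes '}{' (optionally separated by whitespace) only ever occurs
    at the boundary between two top-level objects.
    """
    return "[" + re.sub(r"\}\s*\{", "},{", text.strip()) + "]"


# Hypothetical sample with nested objects
sample = '{"id": 1, "meta": {"k": "v"}}{"id": 2, "meta": {"k": "w"}}'
array_text = wrap_as_array(sample)
objects = json.loads(array_text)  # now parses as an ordinary JSON array
```

Once the content is a valid array, it can be split element by element, which is what SplitJson expects as input.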
Created on ‎05-29-2019 03:18 PM - edited ‎08-18-2019 01:04 AM
How can I split a complex JSON array into individual JSON objects with the SplitJson processor in NiFi? I don't know how to configure the original, split, and failure relationships. The JSON is below:
{
  "scrollId1": "xyz",
  "data": [
    {
      "id": "app-server-dev-glacier",
      "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
      "name": "app-server-dev-glacier",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "ap-southeast-1",
      "account": "164110977718"
    },
    {
      "id": "abc.company.archive.mboi",
      "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
      "name": "abc.company.archive.mboi",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "us-east-1",
      "account": "852631421774"
    }
  ]
}
I need to split it into
{
  "id": "app-server-dev-glacier",
  "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
  "name": "app-server-dev-glacier",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "ap-southeast-1",
  "account": "164110977718"
},
{
  "id": "abc.company.archive.mboi",
  "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
  "name": "abc.company.archive.mboi",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "us-east-1",
  "account": "852631421774"
}
Next, I need to insert another field, "time", in front of "id", the first attribute of each individual object.
I used the SplitJson processor with the JSON Path Expression $.data.id.*, but the relationship reports an error. I don't know how to configure the relationship branches original, split, and failure. Does anyone have any advice? @Shu
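As a sketch of the result being asked for, the Python below pulls each element out of the "data" array and puts a "time" field ahead of "id". It is not a NiFi flow; the function name split_data_array, the shortened sample, and the timestamp value are placeholders invented for the example. One assumption, not confirmed anywhere in this thread: the SplitJson expression probably needs to point at the array itself (for example $.data) rather than $.data.id.*, since "id" is a string inside each element and not an array.

```python
import json


def split_data_array(document_text, timestamp):
    """Return one dict per element of "data", with "time" as the first key."""
    doc = json.loads(document_text)
    # dicts keep insertion order, so "time" is serialized before "id"
    return [{"time": timestamp, **item} for item in doc["data"]]


# Shortened, hypothetical version of the document in the question
sample = """{"scrollId1": "xyz", "data": [
  {"id": "app-server-dev-glacier", "region": "ap-southeast-1"},
  {"id": "abc.company.archive.mboi", "region": "us-east-1"}
]}"""

# Placeholder timestamp; a real flow would supply its own value
records = split_data_array(sample, "2019-05-29T15:18:00Z")
lines = [json.dumps(r) for r in records]
for line in lines:
    print(line)
```

Each output line corresponds to one flowfile on the split relationship, while the untouched input document corresponds to original.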
