How to split a flow file into multiple JSON files
Labels: Apache NiFi
Created ‎05-08-2018 07:54 PM
Hi,
I'm struggling to put JSON into HBase because the file I'm receiving looks like the sample below. I'm able to parse individual JSON objects, but I have no idea how to iterate over the file.
I tried SplitText, SplitContent, and SplitRecord with no luck. Any help will be very much appreciated.
{ "version": "1", "source": { "type": "internal", "id": "123" } } { "version": "1", "source": { "type": "external", "id": "456" } }
Created on ‎05-08-2018 10:11 PM - edited ‎08-18-2019 01:16 AM
Use the SplitContent processor with the configuration below:
Byte Sequence Format
Text
Byte Sequence
}
}
This sequence needs to match your JSON message exactly, including whitespace (i.e. in your question there is one tab before the first closing curly brace, and the second closing curly brace is on a new line).
Keep Byte Sequence
true // determines whether the byte sequence is kept in the output splits
Byte Sequence Location
Trailing // if Keep Byte Sequence is set to true, the sequence is kept at the end of each split
With this configuration the processor splits the message shown into 2 individual messages.
Output flowfiles from the splits relationship:
ff1:
{ "version": "1", "source": { "type": "internal", "id": "123" } }
ff2:
{ "version": "1", "source": { "type": "external", "id": "456" } }
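To see why the byte sequence has to match the file exactly, the split can be imitated outside NiFi. The following is only a rough Python sketch of the idea, not NiFi's implementation: the function name `split_keep_trailing` and the tab-indented layout of `SAMPLE` are assumptions made for illustration, since the exact whitespace of the original file is not shown.

```python
import json


def split_keep_trailing(content: bytes, sequence: bytes) -> list:
    """Cut content after each occurrence of sequence, keeping the
    sequence at the end of every piece (like Keep Byte Sequence = true
    with Byte Sequence Location = Trailing)."""
    pieces = []
    start = 0
    while True:
        idx = content.find(sequence, start)
        if idx == -1:
            break
        end = idx + len(sequence)
        pieces.append(content[start:end])
        start = end
    # Anything left after the last match becomes a final piece.
    if start < len(content):
        pieces.append(content[start:])
    return pieces


# Assumed layout: tab-indented objects, one after another.
SAMPLE = (
    b'{\n\t"version": "1",\n\t"source": {\n'
    b'\t\t"type": "internal",\n\t\t"id": "123"\n\t}\n}\n'
    b'{\n\t"version": "1",\n\t"source": {\n'
    b'\t\t"type": "external",\n\t\t"id": "456"\n\t}\n}'
)

# The sequence is a closing brace, a newline, and another closing brace.
pieces = split_keep_trailing(SAMPLE, b"}\n}")

for piece in pieces:
    # Each piece is a complete JSON object on its own.
    print(json.loads(piece.decode("utf-8")))
```

If the file used different line endings (for example `\r\n`) or extra spaces between the two braces, the sequence `}\n}` would not be found and nothing would be split, which is the reason the answer stresses an exact match.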
In addition, if you want to combine these individual messages into a valid JSON array of messages,
you can use the MergeContent processor with Defragment as the Merge Strategy and change the properties below:
Delimiter Strategy
Text
Header
[
Footer
]
Demarcator
,
In this processor we merge all the individual JSON messages back into one JSON array, with a comma as the demarcator.
Output flowfile:
[{ "version": "1", "source": { "type": "internal", "id": "123" } }, { "version": "1", "source": { "type": "external", "id": "456" } }]
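The effect of the Header, Footer, and Demarcator properties can be pictured with a few lines of Python. This is a minimal sketch under stated assumptions, not how MergeContent is implemented: `merge_fragments` is a hypothetical helper, and each fragment is modelled as an `(index, text)` pair standing in for a flowfile and its `fragment.index` attribute.

```python
import json


def merge_fragments(fragments, header="[", footer="]", demarcator=","):
    """Join fragment texts in index order, wrapped in header and footer."""
    ordered = [text for _, text in sorted(fragments, key=lambda f: f[0])]
    return header + demarcator.join(ordered) + footer


# Fragments may arrive out of order; the index restores the original order.
fragments = [
    (1, '{ "version": "1", "source": { "type": "external", "id": "456" } }'),
    (0, '{ "version": "1", "source": { "type": "internal", "id": "123" } }'),
]

merged = merge_fragments(fragments)

# The merged text parses as a JSON array with one element per message.
records = json.loads(merged)
print(len(records), [r["source"]["id"] for r in records])
```

Without the header and footer the result would be comma-separated objects, which is not valid JSON; the brackets are what turn it into an array that record readers can parse.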
Now you can use record-based processors (ConvertRecord, etc.) to work on chunks of data instead of one message at a time.
If the answer addressed your question, click the Accept button below to accept it. That helps other Community users find solutions quickly for these kinds of issues.
Created ‎05-10-2018 11:34 AM
Thanks a lot ... really appreciate it. I was struggling to get this done, you saved a lot of my time.
