How to split flow file into multiple Json files

New Contributor

Hi,

I'm struggling to load JSON into HBase because the file I receive contains several JSON objects back to back, as shown below. I can parse an individual JSON object, but I have no idea how to iterate over the file.

I tried SplitText, SplitContent, and SplitRecord with no luck. Any help would be very much appreciated.

{
	"version": "1",
	"source": {
		"type": "internal",
		"id": "123"
	}
}
{
	"version": "1",
	"source": {
		"type": "external",
		"id": "456"
	}
}

Re: How to split flow file into multiple Json files

Super Guru
@vivek jain

Use the SplitContent processor with the configuration below:

Byte Sequence Format

Text

Byte Sequence

 }
}

This sequence must match your JSON message byte for byte. In the sample in your question, the inner closing brace is preceded by one tab and is followed by a line break and the outer closing brace, so the Byte Sequence is a tab, }, a line break, and }.

Keep Byte Sequence

true //this determines whether the byte sequence is kept in the split output or discarded

Byte Sequence Location

Trailing //when Keep Byte Sequence is true, the sequence is kept at the end of each split

With this configuration, the processor splits the message shown in your question into 2 individual messages.
Output flowfiles from the splits relationship:

ff1:

{
"version": "1",
"source": {
"type": "internal",
"id": "123"
}
}

ff2:

{
"version": "1",
"source": {
"type": "external",
"id": "456"
}
}
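
To see why the Byte Sequence has to match exactly, the split can be sketched outside NiFi in Python. This is only a rough simulation of Keep Byte Sequence = true with Byte Sequence Location = Trailing, not NiFi's implementation; the function name split_keep_trailing and the sample constants are made up for the illustration, and the sample assumes tab indentation and Unix line breaks as in the question.

```python
import json

# Sample input modelled on the question: two JSON objects back to back,
# indented with tabs and separated by a single "\n" (an assumption).
SAMPLE = (
    b'{\n\t"version": "1",\n\t"source": {\n\t\t"type": "internal",\n\t\t"id": "123"\n\t}\n}\n'
    b'{\n\t"version": "1",\n\t"source": {\n\t\t"type": "external",\n\t\t"id": "456"\n\t}\n}'
)

# Tab, closing brace, line break, closing brace.
SEQUENCE = b"\t}\n}"


def split_keep_trailing(data: bytes, sequence: bytes) -> list[bytes]:
    """Cut data after every occurrence of sequence, keeping the sequence
    at the end of each piece (hypothetical helper, not a NiFi API)."""
    pieces = []
    start = 0
    while True:
        hit = data.find(sequence, start)
        if hit == -1:
            break
        end = hit + len(sequence)
        pieces.append(data[start:end])
        start = end
    # Whatever follows the last match becomes one more piece, if non-empty.
    if start < len(data):
        pieces.append(data[start:])
    return pieces


pieces = split_keep_trailing(SAMPLE, SEQUENCE)
for piece in pieces:
    # Each piece is a complete JSON object; json.loads ignores leading whitespace.
    print(json.loads(piece)["source"])

# With four spaces instead of a tab the sequence never matches,
# so the input comes back as one unsplit piece.
unsplit = split_keep_trailing(SAMPLE, b"    }\n}")
print(len(pieces), len(unsplit))
```

If the real file uses Windows line breaks or spaces for indentation, the sequence would need to change accordingly.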

In addition, if you want to combine these individual messages into a valid JSON array, use the MergeContent processor with Defragment as the Merge Strategy and change the properties below:

Delimiter Strategy

Text

Header

[

Footer

]

Demarcator

,

This processor merges the individual JSON messages back into a single flowfile, wrapping them in [ and ] and separating them with a comma, which yields one JSON array.

Output flowfile:-

[{
"version": "1",
"source": {
"type": "internal",
"id": "123"
}
},
{
"version": "1",
"source": {
"type": "external",
"id": "456"
}
}]
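
The effect of the Header, Footer, and Demarcator settings can be checked with a small Python sketch. It only imitates the text assembly (fragments put back in order, joined with the demarcator, wrapped in header and footer) and is not how NiFi implements it; merge_fragments and the index values are hypothetical names and numbers chosen for the illustration.

```python
import json

# (index, content) pairs standing in for the split flowfiles; they are
# listed out of order on purpose to show the reordering step.
fragments = [
    (2, '{"version": "1", "source": {"type": "external", "id": "456"}}'),
    (1, '{"version": "1", "source": {"type": "internal", "id": "123"}}'),
]


def merge_fragments(fragments, header="[", footer="]", demarcator=","):
    """Join fragments in index order with the given header, footer and
    demarcator (hypothetical helper, not a NiFi API)."""
    ordered = [text for _, text in sorted(fragments, key=lambda f: f[0])]
    return header + demarcator.join(ordered) + footer


merged = merge_fragments(fragments)

# The merged text parses as a JSON array with one element per message.
records = json.loads(merged)
print(len(records), [r["source"]["id"] for r in records])
```

Without the header and footer, or with an empty demarcator, json.loads would reject the merged text, which is why all three properties are set.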

Now you can use record-based processors (ConvertRecord, etc.) to work on chunks of data instead of one message at a time.

If this answer addressed your question, click the Accept button below to accept it. That helps other community users find the solution quickly for this kind of issue.

Re: How to split flow file into multiple Json files

New Contributor

Thanks a lot, I really appreciate it. I was struggling to get this done; you saved me a lot of time.
