Splitting a NiFi flowfile into multiple flowfiles

Expert Contributor

Hi All,

I have the following requirement:

Split a single NiFi flowfile into multiple flowfiles, so that the contents of each resulting flowfile (once extracted) can be inserted as a separate row in a Hive table.

Sample input flowfile:

MESSAGE_HEADER | A | B | C

LINE|1 | ABCD | 1234

LINE|2 | DEFG | 5678

LINE|3 | HIJK | 9012

.

.

.

Desired output files:

Flowfile 1:

MESSAGE_HEADER | A | B | C

LINE|1 | ABCD | 1234

Flowfile 2:

MESSAGE_HEADER | A | B | C

LINE|2 | DEFG | 5678

Flowfile 3:

MESSAGE_HEADER | A | B | C

LINE|3 | HIJK | 9012

.

.

.

The number of lines in the flowfile is not known ahead of time.
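
To pin down the intended behaviour, here is a minimal Python sketch of the split described above. It is not NiFi code, only an illustration of the logic; the function name `split_with_header` is hypothetical, and it assumes the first non-blank line is the header and every other non-blank line is a data row.

```python
from __future__ import annotations


def split_with_header(content: str) -> list[str]:
    """Return one chunk per data row, each prefixed with the header line."""
    # Skip blank lines so the sketch works whether or not rows are
    # separated by empty lines (assumption: blank lines carry no data).
    lines = [line for line in content.splitlines() if line.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    # The number of rows is not known ahead of time, so emit one chunk per row.
    return [f"{header}\n{row}\n" for row in rows]


sample = (
    "MESSAGE_HEADER | A | B | C\n"
    "LINE|1 | ABCD | 1234\n"
    "LINE|2 | DEFG | 5678\n"
    "LINE|3 | HIJK | 9012\n"
)

chunks = split_with_header(sample)
for i, chunk in enumerate(chunks, start=1):
    print(f"Flowfile {i}:\n{chunk}")
```

Each element of `chunks` corresponds to one desired output flowfile: the header line followed by exactly one data row.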

I would like to know the best way to accomplish this with the available NiFi processors. The splitting can be done at the flowfile level, or after the contents have been extracted from the flowfile but before the Hive insert statements are created.

Thanks.

1 ACCEPTED SOLUTION

4 REPLIES


Expert Contributor

@jfrazee Thank you. I'm going the SplitText route for now, and it seems to work.

To save the split files for later reference, how do I assign a different name to each child (split) flowfile? I'm thinking of prepending or appending a UUID to the file name. Right now all of the child flowfiles get the same name as the parent flowfile, which causes them to overwrite one another.
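
The naming idea can be sketched in Python as follows. This only illustrates the scheme (parent name, split index, and a UUID); it is not NiFi code, and the helper `unique_child_name` and the sample name `messages.txt` are hypothetical. Inside NiFi, one possible equivalent, offered as an assumption rather than a confirmed answer from this thread, is an UpdateAttribute processor that sets `filename` from the `fragment.index` attribute written by SplitText or from the flowfile's own `uuid` attribute.

```python
import uuid
from pathlib import PurePath


def unique_child_name(parent_name: str, index: int) -> str:
    """Build a distinct file name for the index-th child of a parent file."""
    path = PurePath(parent_name)
    # The index keeps children ordered; the random UUID avoids collisions
    # even if the same parent name is processed more than once.
    return f"{path.stem}-{index:04d}-{uuid.uuid4()}{path.suffix}"


# Three children of the same parent no longer share one name.
names = [unique_child_name("messages.txt", i) for i in range(1, 4)]
for name in names:
    print(name)
```

Because every child gets its own name, writing them to the same directory no longer overwrites earlier splits.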

Contributor

@jfrazee @Raj B

How did you save it to a file? GetFile -> SplitText -> PutFile?

Expert Contributor

@mel mendoza, in my case I was doing further processing on the split files after splitting them. If your requirement is to store the split files, you could use PutFile to write to the local file system or PutHDFS to write to HDFS.