
NiFi size based File Split

New Contributor

Hi,

I have a use case to split a large file into 0.5 GB files. I was able to split the file into 0.5 GB pieces, but the split is not record-oriented: I see records being split in the middle.

E.g.

Original File:

abc|12324|abc|1234

aaa|12324|abc|1234

ccc|12324|abc|1234

ddd|12324|abc|1234

Split File1

abc|12324|abc|1234

aaa|12324|

Split File2:

abc|1234

ccc|12324|abc|1234

ddd|12324|abc|1234

I am splitting the text with the SplitText processor; I have attached a screenshot below.

What am I doing wrong? Can anyone direct me to examples / templates?

Thanks!

Hemanth

93758-capture.jpg

1 REPLY

Master Guru
@Hemanth Vakacharla

I think for this case we need to split the file into one record per line using the SplitRecord/SplitText processor.

Then, using the MergeContent processor, we can build 500 MB files. This way no record gets split in the middle.

Flow:

1. SplitRecord/SplitText // split the flowfile into one line each
2. MergeRecord/MergeContent // merge up to a 500 MB file size
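
The idea behind the two-step flow above can be sketched outside NiFi in a few lines of Python. This is only an illustration of "split per line, then merge up to a size limit", not NiFi's actual bin-packing implementation; the function name `split_on_record_boundaries` and the tiny 40-byte limit are assumptions made for the example.

```python
def split_on_record_boundaries(data: bytes, max_bytes: int) -> list[bytes]:
    """Split data into chunks of at most max_bytes without cutting a line.

    A single line longer than max_bytes still ends up in its own
    (oversized) chunk, because records are never broken apart.
    """
    chunks = []
    current = []
    current_size = 0
    # Step 1: one record per line (keepends=True preserves the newline).
    for line in data.splitlines(keepends=True):
        # Step 2: close the current bin if the next line would overflow it.
        if current and current_size + len(line) > max_bytes:
            chunks.append(b"".join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += len(line)
    # Flush whatever is left in the last bin.
    if current:
        chunks.append(b"".join(current))
    return chunks


records = (
    b"abc|12324|abc|1234\n"
    b"aaa|12324|abc|1234\n"
    b"ccc|12324|abc|1234\n"
    b"ddd|12324|abc|1234\n"
)

# Hypothetical 40-byte limit standing in for the 500 MB target.
for i, chunk in enumerate(split_on_record_boundaries(records, max_bytes=40), 1):
    print(f"Split File{i}:")
    print(chunk.decode(), end="")
```

Because each chunk is assembled from whole lines, every output file ends on a record boundary, which is what the split-then-merge flow achieves.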

93768-screen-shot-2018-12-01-at-30055-pm.png

To force the merge of flowfiles that have not yet reached the target size, set the Max Bin Age property, e.g. to 30 mins.

If you are using the record-oriented processors, you need to define a Record Reader/Writer with an Avro schema to read/write the flowfile.

Refer to this link for more details regarding the MergeContent processor.