Split a 10 GB XML/CSV file into multiple files; the extracted files must be valid

New Contributor

I have a scenario for which I cannot find the right processors.
Scenario: I have an XML/CSV file of 10 GB. I have to split it into multiple files, each at most 50 MB in size.
My system configuration: 16 GB RAM, 160 GB HDD, Apache NiFi 1.5.0, Java 8, Linux on a dedicated server.

1 REPLY

Master Guru
@Raju Chigicherla

Instead of splitting the file in a single SplitText processor, try a series of SplitText/SplitContent processors to split the 10 GB file, so that no single processor has to produce all of the output files at once.
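
To illustrate the idea of chaining splits (outside NiFi, just as a minimal sketch): the first stage cuts the input into coarse chunks, and the second stage cuts each coarse chunk into the final small ones. The function names and the `coarse`/`fine` line counts are made up for this example; in NiFi the corresponding setting on SplitText would be Line Split Count on each processor in the chain.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def split_lines(lines: Iterable[str], count: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `count` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, count))
        if not chunk:
            return
        yield chunk


def two_stage_split(lines: Iterable[str], coarse: int, fine: int) -> Iterator[List[str]]:
    """Split into coarse chunks first, then split each coarse chunk into fine ones.

    Only one coarse chunk is held in memory at a time.
    """
    for big in split_lines(lines, coarse):
        yield from split_lines(big, fine)
```

If `coarse` is a multiple of `fine`, the final chunks are the same as a single-stage split would give; the difference is only in how many pieces each stage has to create at once.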

(or)

Use record-oriented processors such as SplitRecord and set the records per split to a value that yields files of about 50 MB. If you still have issues with a single SplitRecord processor, use a series of SplitRecord processors to get down to 50 MB files.
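
A rough way to pick the records-per-split value is to measure the average record size on a sample and divide the 50 MB target by it. The sketch below is only an estimate under the assumption that record sizes are fairly uniform; `records_per_split` and its `margin` parameter are hypothetical names for this example, not part of NiFi.

```python
def records_per_split(sample_records, target_bytes=50 * 1024 * 1024, margin=0.9):
    """Estimate how many records fit into one split of at most `target_bytes`.

    `margin` (< 1.0) leaves headroom because real records vary in size,
    so an average-based estimate can otherwise overshoot the target.
    """
    sizes = [len(r.encode("utf-8")) for r in sample_records]
    if not sizes:
        raise ValueError("need at least one sample record")
    avg = sum(sizes) / len(sizes)
    return max(1, int((target_bytes * margin) // avg))
```

For example, take the first few thousand lines of the CSV as `sample_records` and use the result as the starting value for Records Per Split, then check the size of the files that come out and adjust.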

In addition, for splitting XML files, NiFi 1.7 introduced the XmlReader/Writer controller services; with them, XML data can be split in the SplitRecord processor.
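
To show why a record-aware split keeps every XML output file valid, here is a minimal standalone sketch (not NiFi code): it streams the records and wraps each batch in its own copy of the root element. It assumes the records are direct children of the root, that no XML namespaces are used, and that root attributes do not need to be carried over; `split_xml`, `record_tag` and `records_per_file` are names invented for this example.

```python
import io
import xml.etree.ElementTree as ET


def split_xml(source, record_tag, records_per_file):
    """Yield well-formed XML strings with at most `records_per_file` records each."""
    root = None
    batch = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start" and root is None:
            root = elem  # the first element seen is the document root
        elif event == "end" and elem.tag == record_tag:
            elem.tail = None  # drop whitespace that follows the record
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release records already handled to bound memory
            if len(batch) == records_per_file:
                yield f"<{root.tag}>{''.join(batch)}</{root.tag}>"
                batch = []
    if batch:
        yield f"<{root.tag}>{''.join(batch)}</{root.tag}>"
```

Each yielded string parses on its own, which is the property a plain line-based or byte-based split of an XML file cannot guarantee.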

Refer to this and this link for splitting a big file with a series of Split processors.