I have a scenario for which I can't find suitable processors.
Scenario: I have an XML/CSV file of 10 GB. I need to split it into multiple files, each at most 50 MB in size.
My system configuration is 16 GB RAM, a 160 GB HDD, Apache NiFi 1.5.0, Java 8, and Linux on a dedicated server.
Instead of splitting the file with a single SplitText processor, try a series of SplitText/SplitContent processors that split the 10 GB file in stages, from coarse chunks down to the final size.
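To make the staged approach concrete, here is a rough Python sketch for picking the line count of each SplitText in the chain. It is only a planning aid, not anything NiFi runs. The function name `plan_split_stages` and the `max_fanout` limit (how many pieces one processor should emit per incoming file) are my own assumptions for illustration; tune the limit to whatever your heap tolerates.

```python
def plan_split_stages(total_lines, final_lines, max_fanout=1000):
    """Return a line count per SplitText stage, coarsest first.

    total_lines: number of lines in the big input file
    final_lines: lines wanted in each final output file
    max_fanout:  assumed cap on pieces a single stage emits per input
    """
    if total_lines <= 0 or final_lines <= 0 or max_fanout < 2:
        raise ValueError("total_lines and final_lines must be > 0, max_fanout >= 2")

    stages = [final_lines]
    # Add coarser stages in front until the first stage's fan-out is acceptable.
    # Integer ceiling division: (a + b - 1) // b
    while (total_lines + stages[0] - 1) // stages[0] > max_fanout:
        stages.insert(0, stages[0] * max_fanout)
    return stages


if __name__ == "__main__":
    # Hypothetical numbers: 100 million lines, 10,000 lines per final file.
    for i, lines in enumerate(plan_split_stages(100_000_000, 10_000, 100), 1):
        print(f"SplitText #{i}: Line Split Count = {lines}")
```

Each value would go into the Line Split Count property of the corresponding SplitText processor, in order.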
You can also use a record-oriented processor such as SplitRecord and set its Records Per Split property to a value that yields files of about 50 MB. If you still run into issues with a single SplitRecord processor, chain several SplitRecord processors to get down to 50 MB files.
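SplitRecord splits by record count rather than by bytes, so you have to estimate the count from the average record size. A minimal Python sketch of that arithmetic follows, assuming you have read a representative sample of records from the file. The helper name `records_per_split` and the 90% safety margin are my own assumptions, not NiFi settings.

```python
def records_per_split(sample_records, target_bytes=50 * 1024 * 1024, safety_percent=90):
    """Estimate how many records fit into target_bytes.

    sample_records: list of bytes objects, one per sampled record
    safety_percent: share of the target to actually fill, leaving headroom
                    for records that are larger than the sampled average
    """
    if not sample_records:
        raise ValueError("need at least one sample record")
    total = sum(len(r) for r in sample_records)
    if total == 0:
        raise ValueError("sample records are empty")

    budget = target_bytes * safety_percent // 100
    # records = budget / average_size = budget * count / total (integer math)
    return max(1, budget * len(sample_records) // total)


if __name__ == "__main__":
    # Hypothetical sample: ten CSV rows of 100 bytes each (newline included).
    sample = [b"x" * 99 + b"\n"] * 10
    print(records_per_split(sample))
```

If record sizes vary a lot, use a lower safety percentage, since the split is by count and individual files can still exceed 50 MB.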
For splitting XML files, NiFi 1.7 introduced the XmlReader/Writer controller services; with them, the SplitRecord processor can split XML data as well. Note that this is newer than the 1.5.0 you are running.