Apache NiFi - buffered input and output

Hello everybody,

I need to process large text files in NiFi, so I would like to read the content of a flowfile with a buffered reader, perform some filtering/normalization on the records, and then write the new content to an output flowfile. So far I have been doing this in memory, splitting the input flowfile into chunks of a couple of MB, but I would like to explore doing it on disk and try to increase throughput by allowing more threads to run.

It seems that I cannot open an OutputStream to write the new content while I still have an open InputStream reading the original content. Maybe there is a mistake in my code. Is there a suggested template for this?

Thank you all
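One possible direction, offered as a sketch and not a confirmed answer: NiFi's `ProcessSession.write(FlowFile, StreamCallback)` passes the callback both the InputStream of the current content and the OutputStream of the new content at the same time, so the records can be streamed line by line without holding the whole file in memory. The class below shows only the stream-to-stream part in plain Java. The name `RecordFilter`, the method `filterAndNormalize`, the UTF-8 encoding, and the particular filtering rule (trim, lowercase, drop empty lines) are all assumptions made for illustration; none of them come from the original question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: streams records from 'in' to 'out' one line at a time.
final class RecordFilter {

    // Returns the number of records written.
    // Assumption: records are newline-delimited UTF-8 text.
    static long filterAndNormalize(InputStream in, OutputStream out) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        BufferedWriter writer =
                new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));

        long kept = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // Placeholder normalization: trim whitespace and lowercase.
            String record = line.trim().toLowerCase(Locale.ROOT);
            // Placeholder filter: skip records that are empty after trimming.
            if (record.isEmpty()) {
                continue;
            }
            writer.write(record);
            writer.write('\n');
            kept++;
        }
        // Flush the buffer but do not close the streams here:
        // the caller that supplied them is responsible for closing them.
        writer.flush();
        return kept;
    }
}
```

Inside a processor's `onTrigger`, this could be wired in with something like `flowFile = session.write(flowFile, (in, out) -> RecordFilter.filterAndNormalize(in, out));`, since `StreamCallback` declares a single `process(InputStream, OutputStream)` method. Only a line-sized buffer is held per thread, which is what would make running more concurrent tasks feasible.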
