Created 09-27-2016 01:25 PM
Hi, I am using NiFi 1.0. Is there a known memory leak issue? My flow has around 123 processors and processes about 200 FlowFiles of roughly 50 MB each. After processing about 100 files, all processors start throwing this exception: java.lang.OutOfMemoryError: Java heap space.
I have allocated 4 GB of heap in bootstrap.conf, and my flow works perfectly fine for the first 100 files. Please suggest any optimizations required. Is NiFi built to handle this number of processors and files?
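For reference, NiFi's heap size is set through the java.arg.* properties in conf/bootstrap.conf. A 4 GB configuration typically looks like the sketch below (the argument numbers may differ in your file):

```
# conf/bootstrap.conf -- JVM memory settings (argument numbers may vary)
java.arg.2=-Xms4g    # initial heap size
java.arg.3=-Xmx4g    # maximum heap size
```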
Created 09-27-2016 02:16 PM
NiFi can certainly handle dataflows well in excess of 123 processors and far more FlowFiles than you have here. Different processors place different resource strains (CPU, memory, and disk I/O) on your hardware. In addition to processors, FlowFiles themselves have a memory impact: a FlowFile is a combination of physical content (stored in the NiFi content repository) and FlowFile attributes (metadata associated with the content, stored in heap memory). You can experience heap memory issues if your FlowFiles carry very large attribute maps (for example, if you extract large amounts of content into attributes).

The first step is identifying which processor(s) in your flow are memory intensive and are causing your OutOfMemoryError. Processors such as SplitText, SplitXML, and MergeContent can use a lot of heap if they produce a large number of splits from a single file or merge a large number of files into a single file. The reason is that the merging and splitting happen in memory until the resulting FlowFile(s) are committed to the output relationship. There are ways of handling this resource exhaustion through dataflow design: for example, merge a smaller number of files in multiple stages (using multiple MergeContent processors) to produce that one large file, or split a file in multiple stages (using multiple Split processors). Also be mindful of the number of concurrent tasks assigned to these memory-intensive processors.
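As a hypothetical illustration of the staged approach described above, a file of one million lines can be split in two passes so that no single task has to hold a million FlowFiles in memory at once (Line Split Count is a standard SplitText property; the counts here are made up for the example):

```
SplitText #1 (Line Split Count = 10000)  ->  ~100 intermediate FlowFiles
SplitText #2 (Line Split Count = 1)      ->  10000 single-line splits per intermediate
```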
Running with 4 GB of heap is good, but depending on your dataflow design, you may find you need 8 GB or more of heap to satisfy its demand.
Thanks,
Matt
Created 09-28-2016 06:25 AM
Hi @mclark,
I am not using any of those processors; I am mainly using ReplaceText. The main problem is that my NiFi flow can process around 250 files, but after that it throws an OutOfMemoryError even if I give it a single file. I am using -XX:+UseG1GC in bootstrap.conf. It looks as if the memory from previously processed files is not being freed, causing the OutOfMemoryError.
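For reference, GC flags such as -XX:+UseG1GC live alongside the heap settings in conf/bootstrap.conf (the stock file ships with the G1 line commented out). To check whether old-generation memory really is not being reclaimed, standard JDK tools such as jstat -gcutil <pid> or a jmap heap histogram can be run against the live NiFi process. An example configuration (argument numbers may vary) might look like:

```
# conf/bootstrap.conf -- heap plus GC selection (argument numbers may vary)
java.arg.2=-Xms4g
java.arg.3=-Xmx4g
java.arg.13=-XX:+UseG1GC
```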
Created 09-27-2016 03:05 PM
Could you please list the processors you have in the flow?
The processors Matt notes can use a decent chunk of memory, but usage is not really based on the original size of the input file; it is more about the metadata for the individual FlowFiles themselves. So a large input file does not necessarily mean large heap usage. The FlowFile metadata is held in memory, but typically only a very small amount of content is ever in memory.
Some processors, though, do use a lot of memory for one reason or another. We should probably add warnings about them in their docs and in the UI.
Let's look through the list and identify candidates.
Created 09-28-2016 06:11 AM
Hi, most of the time it is ReplaceText that throws this error, even though the file size is only 2.5 MB and there is only one file in the queue. The processor has already processed around 1.5 GB of data, but after that it cannot process a single file. I am using Java 8 and have tried this on NiFi 0.7 and NiFi 1.0.
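One setting worth checking (an assumption on my part, since the processor configuration is not shown here): ReplaceText's Evaluation Mode. In "Entire text" mode the processor buffers the whole FlowFile content in memory to apply the regex, while "Line-by-Line" mode streams the content one line at a time, keeping peak heap usage small. The difference is the same as in this minimal Java sketch (the class and method names are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ReplaceModes {

    // "Entire text" style: the whole content is held in memory at once,
    // so peak heap grows with the size of the input.
    static String replaceWholeBuffer(String content) {
        return content.replaceAll("foo", "bar");
    }

    // "Line-by-Line" style: only one line is in memory at a time,
    // so peak heap stays small regardless of total input size.
    static String replaceLineByLine(String content) throws IOException {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line.replaceAll("foo", "bar")).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String input = "foo one\nfoo two\n";
        // Both modes produce the same result for newline-terminated input,
        // but the streaming version never buffers the full content.
        System.out.println(replaceWholeBuffer(input).equals(replaceLineByLine(input)));
    }
}
```

If your flow really does use "Entire text" mode, switching to "Line-by-Line" (where the regex allows it) may reduce heap pressure considerably.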
Created 10-02-2025 06:01 AM
In my opinion, the java.lang.OutOfMemoryError: Java heap space you are experiencing in NiFi is probably not due to a built-in memory leak, as you suggest. It is more likely the result of the workload exceeding the allocated heap. With such a large number of processors (123) and large FlowFiles (50 MB × 200), memory demand grows rapidly, and the 4 GB heap configured in bootstrap.conf may not be sufficient. In general, NiFi is designed to handle large dataflows, but it requires proper tuning. You can reduce memory pressure by increasing the heap size (if the hardware allows), adjusting processor concurrency so heavy processors do not all run in parallel, and configuring back pressure to limit the number of queued FlowFiles. In addition, efficient use of NiFi's repositories (content, FlowFile, provenance) reduces reliance on heap memory. These optimizations should help your NiFi instance handle the workload more effectively and avoid frequent OutOfMemoryError failures.
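On the back-pressure point: in recent NiFi versions the default thresholds applied to newly created connections can be set in nifi.properties (individual connections can still override them in the UI). Lowering them caps how many FlowFiles, and hence how much attribute metadata, can pile up in heap at once. Assuming a version that supports these properties, the defaults look like:

```
# conf/nifi.properties -- default back-pressure thresholds for new connections
nifi.queue.backpressure.count=10000
nifi.queue.backpressure.size=1 GB
```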
To learn about the different types of OutOfMemoryError and how to resolve them, see this blog post: Types of OutOfMemoryError