Member since: 12-03-2017
Posts: 155
Kudos Received: 26
Solutions: 11
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1810 | 11-03-2023 12:17 AM |
 | 3771 | 12-12-2022 09:16 PM |
 | 1433 | 07-14-2022 03:25 AM |
 | 2198 | 07-28-2021 04:42 AM |
 | 2775 | 06-23-2020 10:08 PM |
06-04-2025
11:36 PM
@hegdemahendra This is a classic case of off-heap memory consumption in NiFi. The 3G you see in the GUI only represents JVM heap + non-heap memory, but NiFi uses significant additional memory outside the JVM that doesn't appear in those metrics. Next time, could you share your deployment YAML files? That would help with finding a solution.

Root Causes of Off-Heap Memory Usage:

1. Content Repository (Primary Culprit)
- NiFi uses memory-mapped files for the content repository
- Large FlowFiles are mapped directly into memory
- This memory appears as process memory but not JVM memory

2. Provenance Repository
- Uses Lucene indexes that consume off-heap memory
- Memory-mapped files for provenance data storage

3. Native Libraries
- Compression libraries (gzip, snappy)
- Cryptographic libraries
- Network I/O libraries

4. Direct Memory Buffers
- NIO operations use direct ByteBuffers
- Network and file I/O operations

Possible Solutions:

1. Reduce JVM Heap Size

# Instead of 28G JVM heap, try:
NIFI_JVM_HEAP_INIT: "16g"
NIFI_JVM_HEAP_MAX: "16g"

This leaves more room (24G) for off-heap usage.

2. Configure Direct Memory Limit

Add the JVM argument: -XX:MaxDirectMemorySize=8g
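As a minimal sketch of where these JVM settings ultimately land, assuming heap and extra arguments are passed through a standard conf/bootstrap.conf (the java.arg.N indexes below are illustrative; reuse or extend the indexes already present in your file):

# conf/bootstrap.conf (illustrative indexes)
# 16G heap instead of 28G
java.arg.2=-Xms16g
java.arg.3=-Xmx16g
# Cap direct (off-heap NIO) buffer allocations
java.arg.20=-XX:MaxDirectMemorySize=8g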
3. Content Repository Configuration

In nifi.properties:

# Limit content repository size
nifi.content.repository.archive.max.retention.period=1 hour
nifi.content.repository.archive.max.usage.percentage=50%

# Use file-based instead of memory-mapped (if possible)
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository

4. Provenance Repository Tuning

# Reduce provenance retention
nifi.provenance.repository.max.storage.time=6 hours
nifi.provenance.repository.max.storage.size=10 GB

Long-term Solutions:

1. Increase Pod Memory Limit

resources:
  limits:
    memory: "60Gi"  # Increase from 40G
  requests:
    memory: "50Gi"

2. Monitor Off-Heap Usage

Enable JVM flags for better monitoring:

-XX:NativeMemoryTracking=summary
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintNMTStatistics
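Once native memory tracking is enabled, a baseline-and-diff workflow makes off-heap growth easier to attribute; a hedged sketch (pod name and PID are placeholders):

# Take a baseline shortly after startup
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory baseline

# Later, once process memory has grown, compare against that baseline
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory summary.diff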
3. Implement Memory-Efficient Flow Design

- Process smaller batches
- Avoid keeping large FlowFiles in memory
- Use streaming processors where possible
- Implement backpressure properly

4. Consider Multi-Pod Deployment

Instead of a single large pod, use multiple smaller pods:

# 3 pods with 20G each instead of 1 pod with 40G
replicas: 3
resources:
  limits:
    memory: "20Gi"
Monitoring Commands:

# Check native memory tracking
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory summary

# Monitor process memory
kubectl top pod <nifi-pod>

# Check memory breakdown
kubectl exec -it <nifi-pod> -- cat /proc/<pid>/status | grep -i mem

Start with reducing the JVM heap to 16G and implementing content repository limits. This should immediately reduce OOM occurrences while you plan for longer-term solutions. Always remember to share your configuration files with the vital data masked or scrambled. Happy hadooping
04-27-2025
11:40 PM
@MattWho I see the below entries in the dumps; apart from these I don't see anything else waiting, etc.

"Variable Registry Update Thread" Id=70591 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@7f1aa08e
at java.base@11.0.22/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.22/java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.base@11.0.22/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ArrayBlockingQueue.take(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base@11.0.22/java.lang.Thread.run(Unknown Source)

-------------

"Timer-Driven Process Thread-102" Id=1030 BLOCKED on org.apache.nifi.controller.scheduling.LifecycleState@3ed4243f
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:147)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.base@11.0.22/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base@11.0.22/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base@11.0.22/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base@11.0.22/java.lang.Thread.run(Unknown Source)

Number of Locked Synchronizers: 1
- java.util.concurrent.ThreadPoolExecutor$Worker@2843111c

Thanks, mahendra
03-27-2025
11:03 PM
Thank you so much @MattWho for the detailed answer. The retry logic helped a lot; I have added 'RetryFlowFile' processors in between to avoid an infinite retry loop.
12-24-2024
06:00 AM
1 Kudo
@hegdemahendra As far as your issue goes, it would probably be useful to collect a series of thread dumps (at least spaced 5 minutes apart). Then you would be looking for any threads related to the stopping of components to see if they are progressing or hung. Is it stuck on stopping a specific processor or processor class? Do any of the processors that are being stopped have active threads showing for them? Thank you, Matt
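One possible way to collect that series, assuming a conventional install path of /opt/nifi (path, count, and interval are illustrative), is the bundled nifi.sh dump command in a small loop:

# Capture 4 thread dumps, 5 minutes apart
for i in 1 2 3 4; do
  /opt/nifi/bin/nifi.sh dump /tmp/nifi-thread-dump-$i.txt
  sleep 300
done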
12-18-2024
12:41 AM
1 Kudo
Hi @hegdemahendra, I am also facing the same issue with FetchDistributedMapCache. How can I check the size of the FlowFile we are reading? For me it is also set to 1 MB, but the data we are saving in the cache should not be 1 MB, as we are barely saving 100 characters in the FlowFile payload. How can we check the size of the data we are trying to get from the cache? Thanks, Akash
11-27-2024
02:25 AM
1 Kudo
Hello Experts, I was using the "ConsumeAzureEventHub" processor with NiFi 1.16.3, and when I configured the 'Storage Container Name' field to store the consumer group state, the processor automatically created the container (if not present) in the storage account when it was started. But in NiFi 1.25 I am seeing a different behavior: it does not auto-create the container on processor start; instead it just shows a "container does not exist" error. Is this the expected behaviour in 1.25? If so, what is the solution? Should we separately create the container beforehand and then use it in the processor? Thanks, Mahendra
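For what it's worth, if pre-creating the container does turn out to be required, a rough sketch with the Azure CLI (account and container names are placeholders) would be:

# Pre-create the consumer-group-state container before starting the processor
az storage container create --account-name <storage-account> --name <consumer-group-state-container>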
Labels:
- Apache NiFi
10-03-2024
04:32 AM
1 Kudo
Hello Experts, We have a 2-node NiFi cluster running on a Kubernetes cluster. We want the incoming HTTP requests on a specific port to be load balanced equally across both nodes (round robin). Is there any way to do this in Kubernetes? I tried a headless service and ClusterIP, but they did not work as expected. Is there any other way to achieve this without external load balancers like AWS ELB, etc.? Thanks, Mahendra
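For context, a minimal sketch of the kind of in-cluster Service definition being described here (name, label selector, and port are placeholders, not a confirmed fix):

apiVersion: v1
kind: Service
metadata:
  name: nifi-http
spec:
  type: ClusterIP
  selector:
    app: nifi
  ports:
    - name: http-listen
      port: 8081
      targetPort: 8081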
Labels:
- Apache NiFi
09-09-2024
01:52 PM
@hegdemahendra The FlowFile connection back pressure thresholds are soft limits. Once one of the configured back pressure thresholds is reached or exceeded, NiFi will not allow the processor feeding that connection to be scheduled to execute again. So in your case no back pressure is being applied yet, and the ConsumeAzureEventHub processor is still being allowed to be scheduled to execute. During a single execution it is consuming more events than the threshold settings. What is the batch size set to in your ConsumeAzureEventHub processor?

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
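For reference, the thresholds themselves are configured per connection in each connection's settings; nifi.properties only supplies the defaults applied to newly created connections, which (hedged, and version dependent) look like this:

# Defaults for new connections only; existing connections keep their own thresholds
nifi.queue.backpressure.count=10000
nifi.queue.backpressure.size=1 GB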
09-04-2024
10:18 PM
1 Kudo
@araujo @bbende @MattWho - do you have any suggestions?
09-04-2024
07:54 AM
Hello @Mais - Were you able to deserialise and consume both the key and the value? In my case I am able to get the deserialised value but don't see the key anywhere!