@hegdemahendra
This is a classic case of off-heap memory consumption in NiFi. The 3G you see in the GUI only represents JVM heap plus non-heap memory, but NiFi uses significant additional memory outside the JVM that does not appear in those metrics. Next time, could you share your deployment YAML files? That would help with working out a solution.
Root Causes of Off-Heap Memory Usage:
- Content Repository (primary culprit)
  - Content repository I/O keeps large amounts of OS page cache resident
  - Large FlowFiles streamed to and from disk drive this usage up
  - This memory counts toward the pod's memory usage (cgroup accounting) but never shows in the JVM metrics (see the quick check after this list)
- Provenance Repository
  - Uses Lucene indexes that consume off-heap, memory-mapped memory
  - Memory-mapped files for provenance data storage
- Native Libraries
  - Compression libraries (gzip, snappy)
  - Cryptographic libraries
  - Network I/O libraries
- Direct Memory Buffers
  - NIO operations use direct ByteBuffers
  - Network and file I/O operations
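A quick way to see the gap for yourself (a minimal sketch; <nifi-pod> and <pid> are placeholders for your pod name and the NiFi Java process ID):
# OS view: full resident set of the NiFi JVM, including off-heap allocations
kubectl exec -it <nifi-pod> -- grep VmRSS /proc/<pid>/status
# JVM's own view of the heap (Java 9+; on Java 8 use the GUI's system diagnostics instead)
kubectl exec -it <nifi-pod> -- jcmd <pid> GC.heap_info
The difference between VmRSS and what the JVM reports is the off-heap usage discussed above.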
Possible Solutions:
1. Reduce JVM Heap Size
# Instead of 28G JVM heap, try:
NIFI_JVM_HEAP_INIT: "16g"
NIFI_JVM_HEAP_MAX: "16g"
This leaves more room (24G of the 40G pod limit) for off-heap usage.
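As a sketch of where these would go in a Kubernetes manifest (assuming the stock apache/nifi image, whose start script reads these environment variables; adjust names to match your own chart or manifests):
containers:
  - name: nifi
    image: apache/nifi   # assumption: image/chart that honors these env vars
    env:
      - name: NIFI_JVM_HEAP_INIT
        value: "16g"
      - name: NIFI_JVM_HEAP_MAX
        value: "16g"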
2. Configure Direct Memory Limit
Add JVM arguments:
-XX:MaxDirectMemorySize=8g
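If you set JVM arguments through conf/bootstrap.conf rather than an environment variable, NiFi reads numbered java.arg entries; the index below (20) is arbitrary, just pick one that is not already in use:
# conf/bootstrap.conf -- cap direct (NIO) buffer usage off-heap
java.arg.20=-XX:MaxDirectMemorySize=8g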
3. Content Repository Configuration
In nifi.properties:
# Limit content repository size
nifi.content.repository.archive.max.retention.period=1 hour
nifi.content.repository.archive.max.usage.percentage=50%
# Keep the default file-based repository (avoid the volatile in-memory implementation)
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
4. Provenance Repository Tuning
# Reduce provenance retention
nifi.provenance.repository.max.storage.time=6 hours
nifi.provenance.repository.max.storage.size=10 GB
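To verify how much the repositories actually hold before and after tightening retention, check their on-disk size (a sketch; the paths below are the defaults in the apache/nifi image and may differ in your deployment):
kubectl exec -it <nifi-pod> -- du -sh \
  /opt/nifi/nifi-current/content_repository \
  /opt/nifi/nifi-current/provenance_repository \
  /opt/nifi/nifi-current/flowfile_repository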
Long-term Solutions:
1. Increase Pod Memory Limit
resources:
  limits:
    memory: "60Gi"   # Increase from 40G
  requests:
    memory: "50Gi"
2. Monitor Off-Heap Usage
Enable JVM flags for better monitoring:
-XX:NativeMemoryTracking=summary
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintNMTStatistics
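In bootstrap.conf form, the essential flag is NativeMemoryTracking, which is what the jcmd VM.native_memory commands below rely on; the other two flags only add an NMT printout at JVM exit (index arbitrary, as before):
# conf/bootstrap.conf -- enables the jcmd VM.native_memory commands shown below
java.arg.25=-XX:NativeMemoryTracking=summary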
3. Implement Memory-Efficient Flow Design
- Process smaller batches
- Avoid keeping large FlowFiles in memory
- Use streaming processors where possible
- Implement backpressure properly
4. Consider Multi-Pod Deployment
Instead of single large pod, use multiple smaller pods:
# 3 pods with 20G each instead of 1 pod with 40G
replicas: 3
resources:
  limits:
    memory: "20Gi"
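A rough sketch of the per-pod sizing in a StatefulSet (this assumes you already run NiFi as a proper cluster with ZooKeeper/state management configured; only the sizing-related fields are shown, and the 10g heap is an assumption that leaves roughly half of each pod free for off-heap use):
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: nifi
          env:
            - name: NIFI_JVM_HEAP_MAX
              value: "10g"   # assumption: ~half of the pod limit kept free for off-heap
          resources:
            requests:
              memory: "20Gi"
            limits:
              memory: "20Gi"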
Monitoring Commands:
# Check native memory tracking
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory summary
# Monitor process memory
kubectl top pod <nifi-pod>
# Check memory breakdown
kubectl exec -it <nifi-pod> -- cat /proc/<pid>/status | grep -i mem
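Once native memory tracking is enabled, you can also baseline and diff to see which JVM subsystem is growing (same placeholders as above):
# Take a baseline, let the flow run for a while, then diff
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory baseline
kubectl exec -it <nifi-pod> -- jcmd <pid> VM.native_memory summary.diff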
Start by reducing the JVM heap to 16G and putting the content repository limits in place. That should immediately reduce the OOM kills while you plan the longer-term changes. When you do share your configuration files, remember to mask or scramble any sensitive values.
Happy hadooping