Created 11-19-2015 10:46 PM
When we run HDF on a single machine, do all the data flows built on that machine run under a single JVM?
I did see NiFi documentation that talks about how you can control spilling data from the JVM to the hard disk. But is there an option to run multiple JVMs, say one for each flow? Also, how big a JVM heap do you usually use for an edge node?
Created 11-19-2015 11:27 PM
When you run HDF on a single machine it is a single JVM process. That instance has three internal repositories (content, flow file, and provenance), and the configuration controls how much of each repository is retained on disk. For high performance it is best to put each repository on a separate disk.
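For example, the repository locations can be pointed at separate mounts in conf/nifi.properties. This is just a sketch; the mount paths below are placeholders, not defaults:

# conf/nifi.properties - put each repository on its own disk (example paths)
nifi.flowfile.repository.directory=/mnt/disk1/flowfile_repository
nifi.content.repository.directory.default=/mnt/disk2/content_repository
nifi.provenance.repository.directory.default=/mnt/disk3/provenance_repository

Separating them matters because the content repository is typically the heaviest I/O consumer, so isolating it keeps flow file and provenance writes from contending for the same spindle.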
One instance can have many logical flows, which can optionally be grouped inside process groups. There can be many disconnected logical flows within one instance. There are discussions about future capabilities where logical flows could be restricted to only certain groups of users.
The default memory setting for NiFi out of the box is 512MB, so at the moment that would probably be the starting point for an edge node.
Created 11-20-2015 01:09 AM
Thanks @bbende!! So we do not recommend scaling NiFi vertically by increasing the JVM heap to a really large size?
Created 11-20-2015 01:32 AM
We definitely do recommend increasing the heap appropriately for the given use case. I was focused on some of the other aspects and forgot to mention that 🙂
In conf/bootstrap.conf there are the following default settings which can be increased:
# JVM memory settings
java.arg.2=-Xms512m
java.arg.3=-Xmx512m
It is hard to recommend a general heap size for all use cases, but anywhere from 512MB up to 8GB is common.
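As an illustration, bumping the heap to 4GB would mean editing those two lines in conf/bootstrap.conf (the 4g value here is just an example, not a recommendation for every workload):

# JVM memory settings - example of a larger heap
java.arg.2=-Xms4g
java.arg.3=-Xmx4g

Keeping -Xms and -Xmx equal avoids heap resizing pauses at runtime; a restart of NiFi is required for the change to take effect.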