Where are the cache entries for DistributedMapCacheServer stored in the NiFi file system? What is the maximum number of entries DistributedMapCacheServer can hold? The default value is 10000; if we want to increase it to 100000, does this require any infra changes on the server or filesy


Master Mentor

@Chetan_mn 

NiFi's DistributedMapCacheServer controller service has existed in Apache NiFi since the 0.x releases when Apache NiFi offered no HA at all.  In the Apache NiFi 0.x releases NiFi had a dedicated NiFi Cluster Manager (NCM) and NiFi cluster nodes that all reported to the NCM.  The only way to access the NiFi UI was via the NCM server.  The DistributedMapCacheServer controller service at that time only ran on the NCM and not on any of the cluster nodes.  The DistributedMapCacheClient controller service ran on all the nodes so that all cluster nodes could read and write to the same cache server.

Fast forward to the Apache NiFi 1.x+ releases, where Apache NiFi eliminated the NCM in favor of a new zero-master clustering ability.  This provides HA at the NiFi control layer, so users can access the NiFi cluster via ANY connected node's URL.  Since there is no dedicated NCM anymore, a controller service like DistributedMapCacheServer, when added, now starts a DistributedMapCache server on each node independently of the others.  The multiple cache servers do NOT communicate or share cache entries with one another, so you effectively have a single point of failure with this cache provider.  If the node your DistributedMapCacheClient is configured for goes down, you have an outage until it is recovered.
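
To make the per-node behavior concrete, here is a small toy model in Python. It is only a sketch of the idea described above, not NiFi code: the class names, the `up` flag, and the hostnames `node1` to `node3` are all made up for illustration.

```python
class NodeCacheServer:
    """Toy stand-in for the cache server running on one NiFi node."""

    def __init__(self, hostname):
        self.hostname = hostname
        self.entries = {}  # private to this node; never shared with peers
        self.up = True


class CacheClient:
    """Toy client pinned to a single server hostname."""

    def __init__(self, servers, server_hostname):
        self.server = servers[server_hostname]

    def put(self, key, value):
        self._check()
        self.server.entries[key] = value

    def get(self, key):
        self._check()
        return self.server.entries.get(key)

    def _check(self):
        # The client has no fallback: if its one server is down, it fails.
        if not self.server.up:
            raise ConnectionError(f"{self.server.hostname} is down")


if __name__ == "__main__":
    servers = {h: NodeCacheServer(h) for h in ("node1", "node2", "node3")}
    client = CacheClient(servers, "node1")
    client.put("k", "v")
    print(servers["node2"].entries)  # node2's cache never saw the entry

    servers["node1"].up = False      # simulate the configured node going down
    try:
        client.get("k")
    except ConnectionError as err:
        print("outage:", err)
```

Even though the other nodes are still running their own cache servers in this model, the client cannot use them, which is the single point of failure described above.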

Apache NiFi now offers better options that provide a true distributed map cache capability, such as HBase, Redis, Hazelcast, etc.  These utilize an externally installed map cache service that offers better fault tolerance through HA, but they add an additional service dependency.

Now, if you still choose to use the DistributedMapCacheServer, keep in mind that all cached entries will be held in NiFi's heap memory.  So the larger each cache entry is and the larger the number of cache entries held, the more NiFi heap will be consumed.
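
As a rough way to size this, the heap held by a full cache can be estimated from the number of entries and the average key and value sizes. The Python sketch below is a back-of-the-envelope calculation, not something NiFi provides. The function name, the example key and value sizes, and the 100-byte per-entry overhead are assumptions for illustration only.

```python
def estimate_cache_heap_bytes(max_entries, avg_key_bytes, avg_value_bytes,
                              per_entry_overhead_bytes=100):
    """Rough estimate of heap used when the cache is full.

    per_entry_overhead_bytes is an assumed allowance for JVM object
    headers and map bookkeeping; it is not a measured NiFi figure.
    """
    per_entry = avg_key_bytes + avg_value_bytes + per_entry_overhead_bytes
    return max_entries * per_entry


if __name__ == "__main__":
    # Hypothetical workload: 50-byte keys and 1 KiB values.
    for entries in (10_000, 100_000):
        total = estimate_cache_heap_bytes(entries, 50, 1024)
        print(f"{entries} entries: about {total / (1024 * 1024):.1f} MiB")
```

The estimate scales linearly, so going from 10000 to 100000 entries multiplies it by ten. It is worth comparing that figure against the heap actually configured for NiFi before raising the limit.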

The DistributedMapCacheServer has an optional configuration for "Persistence Directory".  When configured, the cache entries will be persisted to disk at the configured location. The amount of space required again depends on cache entry size and the number of entries to retain.  Keep in mind that configuring this persistence directory does NOT remove cache entries from NiFi's heap memory. It simply persists a copy of the cache entries to disk so that, should NiFi be restarted, NiFi can reload the cache from disk into heap memory.  If no persistence directory is configured, NiFi going down would result in a loss of all cache entries.
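
The behavior described above (entries always in memory, with an optional copy on disk that is reloaded on restart) can be illustrated with a toy model in Python. This is only a conceptual sketch and does not reflect how NiFi actually implements persistence. The class name, the `cache.json` file name, and the JSON format are assumptions made for the example.

```python
import json
import os
import tempfile


class PersistentMapCache:
    """Toy model: entries always live in memory; disk holds only a copy."""

    def __init__(self, persistence_dir=None):
        self.entries = {}  # stands in for NiFi heap memory
        self.path = None
        if persistence_dir is not None:
            self.path = os.path.join(persistence_dir, "cache.json")
            if os.path.exists(self.path):
                # On "restart", reload the persisted copy into memory.
                with open(self.path) as f:
                    self.entries = json.load(f)

    def put(self, key, value):
        self.entries[key] = value
        if self.path is not None:
            # Persist a copy; the in-memory entry is NOT removed.
            with open(self.path, "w") as f:
                json.dump(self.entries, f)

    def get(self, key):
        return self.entries.get(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cache = PersistentMapCache(d)
        cache.put("k1", "v1")
        restarted = PersistentMapCache(d)  # simulated restart, same directory
        print(restarted.get("k1"))         # entry was reloaded from disk

    volatile = PersistentMapCache()        # no persistence directory
    volatile.put("k1", "v1")
    print(PersistentMapCache().get("k1"))  # new instance starts empty
```

In this model the memory footprint is the same with or without a persistence directory. The directory only changes what survives a restart.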

If you found that any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

Community Manager

@Chetan_mn Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.  Thanks.


Regards,

Diana Torres,
Community Moderator

