Following is our Kudu cluster configuration:
Number of Master :- 3
Number of tablet servers :- 5
Each tablet server has 64 GB RAM and an 8-core CPU
Total number of tables is 254 and total number of tablets is 1265, with a replication factor of 3, so after replication there are 3795 replicas
Replica distribution across the tablet servers is even: 759 per tablet server
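To make the replica numbers above easy to re-check, here is a minimal Java sketch of the arithmetic. The class and method names are made up for illustration, and it assumes replicas are spread perfectly evenly across the tablet servers.

```java
// Hypothetical helper (not part of Kudu): re-derives the replica counts quoted above.
final class ReplicaMath {
    // Every tablet is stored replicationFactor times.
    static int totalReplicas(int tablets, int replicationFactor) {
        return tablets * replicationFactor;
    }

    // Assumes a perfectly even spread; integer division drops any remainder.
    static int replicasPerServer(int totalReplicas, int tabletServers) {
        return totalReplicas / tabletServers;
    }

    public static void main(String[] args) {
        int total = totalReplicas(1265, 3);
        System.out.println("total replicas = " + total);
        System.out.println("replicas per tablet server = " + replicasPerServer(total, 5));
    }
}
```

With 1265 tablets, a replication factor of 3 and 5 tablet servers this gives 3795 replicas in total and 759 per server, matching the figures above.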
--block-cache-size is 2 GB and --memory-hard-limit is 50 GB
For running queries on top of Kudu we use Impala. For ingesting data into Kudu in real time we use a multi-threaded application built on top of the Kudu Java client.
Under normal load, memory usage on all the tablet servers is between 35 and 40 GB
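To put the normal-load figure in context against the 50 GB hard limit, here is a rough Java sketch of the headroom. The 80% soft-limit value is an assumption: I believe it is the default of Kudu's --memory_limit_soft_percentage flag, but it should be checked against the running version. The class and method names are made up for illustration.

```java
// Hypothetical helper (not part of Kudu): how close normal-load usage is to the limits.
final class MemoryHeadroom {
    // ASSUMPTION: soft limit is 80% of the hard limit; verify for your Kudu version.
    static final double SOFT_LIMIT_PERCENT = 80.0;

    // Memory in use as a percentage of the hard limit.
    static double percentOfHardLimit(double usedGb, double hardLimitGb) {
        return 100.0 * usedGb / hardLimitGb;
    }

    // GB left before the assumed soft limit is reached (negative if already past it).
    static double headroomToSoftLimitGb(double usedGb, double hardLimitGb) {
        double softLimitGb = hardLimitGb * SOFT_LIMIT_PERCENT / 100.0;
        return softLimitGb - usedGb;
    }

    public static void main(String[] args) {
        double hardLimitGb = 50.0;
        for (double usedGb : new double[] {35.0, 40.0}) {
            System.out.println(usedGb + " GB used: "
                + percentOfHardLimit(usedGb, hardLimitGb) + "% of hard limit, "
                + headroomToSoftLimitGb(usedGb, hardLimitGb) + " GB below the assumed soft limit");
        }
    }
}
```

If that assumption holds, 35-40 GB is already 70-80% of the hard limit, so the tablet servers have at most about 5 GB of slack before the soft limit, even without any backfill running.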
The issue arises when we backfill data for some of the tables. Memory usage climbs and does not come back down.
Because of this we often hit the memory threshold, Kudu stops accepting writes, and we are forced to restart Kudu.
So what could be the reasons for memory usage not going down? What can I do to bring it down?