We have nine datanodes running HDP-2.6.5, each with 125 GB of RAM, 32 cores, and 2 TB of disk space. There are six tables, each with one column family, and each pre-split with roughly 30 regions per server. The row space is evenly distributed, and we are not running into hot-spotting issues.

We have ingested 1.3 TB into HBase and are seeing performance degradation when running batch GETs on non-contiguous data via the Java API. A batch GET of size 2500 takes 50 seconds. As a test, we ingested only 33 GB, and a batch GET of size 750 took hundreds of milliseconds. For the batch GET of size 2500, the logs show “#1, waiting for 2289 actions to finish,” with the number of actions decreasing every 10 seconds until the results are returned at 50 seconds.

We manage the cluster through Ambari, and we have not tuned or optimized anything as the data has grown. We are not sure how to do so. Are there any suggestions for proceeding? Is there documentation on operating and maintaining HBase at scale, or footprint recommendations?
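To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Java. It only does arithmetic on the figures quoted in this post. The class and method names are made up for illustration, 1 TB is taken as 1000 GB, and the per-node figure ignores HDFS replication and compression, since we have not measured those.

```java
// Back-of-the-envelope arithmetic on the figures quoted in the post.
// Class and method names are hypothetical; nothing here calls HBase.
public class BatchGetNumbers {

    // Average wall-clock time per GET if the batch is treated as one serial unit.
    static double perGetMillis(int batchSize, double totalSeconds) {
        return totalSeconds * 1000.0 / batchSize;
    }

    // Raw ingested data per datanode, assuming an even spread
    // (replication and compression are NOT accounted for).
    static double ingestedGbPerNode(double totalGb, int nodes) {
        return totalGb / nodes;
    }

    public static void main(String[] args) {
        // 2500 GETs in 50 seconds on the 1.3 TB data set.
        System.out.printf("avg per GET: %.1f ms%n", perGetMillis(2500, 50.0));

        // 1.3 TB (taken as 1300 GB) spread over nine datanodes,
        // to compare against the 125 GB of RAM on each node.
        System.out.printf("ingested per node: %.1f GB%n", ingestedGbPerNode(1300.0, 9));
    }
}
```

Under these assumptions the large batch averages about 20 ms per GET, and the raw data per node is somewhat larger than the RAM on each node, whereas the 33 GB test set would fit in memory many times over. That may or may not be related to the slowdown.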