Created on 03-19-2020 02:10 AM - edited 03-19-2020 04:04 AM
In our HBase cluster, the compaction queue never drains and keeps growing day by day. We have changed many parameters, but nothing helped. This situation is also affecting cluster performance and stability. We read the HBase documentation thoroughly and ran many tests based on what we found, but did not get a satisfactory result.
We are using HBase as a graph store. It grows by 1.5 TB / 1.5 billion rows per day. The rows are relatively small, and the workload is write-intensive.
Our 17 region servers have 256 GB of memory installed on them and we have not observed any bottleneck on servers' resources.
We use multiple tables; here are the definitions of the ones that account for most of the I/O.
1-
'myGraph:edgeIndices', {TABLE_ATTRIBUTES => {DURABILITY => 'USE_DEFAULT'}, {NAME => 'f', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '131072'}
2-
'myGraph:edges', {TABLE_ATTRIBUTES => {DURABILITY => 'USE_DEFAULT'}, {NAME => 'f', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '131072'}
I have attached hbase-config.xml and some server logs (covering 2 hours).
Thanks.
Created 06-14-2020 05:22 PM
Did you try changing the configuration for compaction throughput?
https://hbase.apache.org/book.html#faq
----------------------------------------------------------------------------------------------------------------------
HBase 2.x comes with default limits to the speed at which compactions can execute. This limit is defined per RegionServer. In versions of HBase earlier than 1.5, there was no limit to the speed at which a compaction could run by default. Applying a limit to the throughput of a compaction should ensure more stable operations from RegionServers.
Take care to notice that this limit is per RegionServer, not per compaction.
The throughput limit is defined as a range of bytes written per second, and is allowed to vary within the given lower and upper bound. RegionServers observe the current throughput of a compaction and apply a linear formula to adjust the allowed throughput, within the lower and upper bound, with respect to external pressure. For compactions, external pressure is defined as the number of store files with respect to the maximum number of allowed store files. The more store files, the higher the compaction pressure.
Configuration of this throughput is governed by the following properties.
The lower bound is defined by hbase.hstore.compaction.throughput.lower.bound and defaults to 50 MB/s (52428800).
The upper bound is defined by hbase.hstore.compaction.throughput.higher.bound and defaults to 100 MB/s (104857600).
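To make the linear adjustment concrete, here is a minimal sketch of how an allowed throughput could be interpolated between the two default bounds. It is a simplified model and not HBase's actual controller code: the class name, the method name, and the clamping of pressure to the range 0.0 to 1.0 are assumptions made for illustration, and the real controller may behave differently at the extremes.

```java
// Simplified model of a pressure-based compaction throughput limit.
// Assumption: this is an illustration, not the HBase implementation.
final class CompactionThrottleSketch {
    // Default bounds quoted above, in bytes per second.
    static final long LOWER_BOUND = 52_428_800L;    // 50 MB/s
    static final long HIGHER_BOUND = 104_857_600L;  // 100 MB/s

    /**
     * Linearly interpolates between the lower and higher bound.
     * "pressure" is a hypothetical value where 0.0 means few store files
     * and 1.0 means the maximum allowed number of store files.
     * Assumption: values outside [0.0, 1.0] are clamped in this sketch.
     */
    static double allowedThroughput(double pressure) {
        double clamped = Math.max(0.0, Math.min(1.0, pressure));
        return LOWER_BOUND + (HIGHER_BOUND - LOWER_BOUND) * clamped;
    }

    public static void main(String[] args) {
        // Print the allowed throughput for a few sample pressure values.
        for (double p : new double[] {0.0, 0.25, 0.5, 1.0}) {
            double mbPerSec = allowedThroughput(p) / (1024.0 * 1024.0);
            System.out.printf("pressure=%.2f -> %.1f MB/s%n", p, mbPerSec);
        }
    }
}
```

Under this model a RegionServer with the default bounds never compacts faster than 100 MB/s in total. For a write-heavy cluster with a growing queue, that cap is one thing worth checking against the observed compaction rate.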
To revert this behavior to the unlimited compaction throughput of earlier versions of HBase, please set the following property to the implementation that applies no limits to compactions.
hbase.regionserver.throughput.controller=org.apache.hadoop.hbase.regionserver.throttle.NoLimitThroughputController