Created 06-06-2019 06:16 PM
I'm using HBase Java API version 1.1.2 with server version 1.1.2. I'm calling table.get(List<Get>) with a list of 60,000 Get operations (list size = 60000). In my benchmark it takes about 6 seconds to fetch all 60,000 records from the bucket cache. Does anyone know why this bulk get is so slow when the records are served from cache?
(Each record is 48 bytes.)
Below are my region server settings:
1)hbase region server heap=30GB
2)Block cache size=40% of total heap=12GB
3)Bucket cache size=10GB[offheap ioengine]
4)Memstore flush size=150MB
5)Memstore size=40% of total heap=12GB
6)Max HFile size=1GB
7)Number of HFiles for compaction=10
8)number of handler threads=35
Below are some other details:
1)Total load: 300 million
2)Number of regions: 10
3)Number of HFiles: 15
4)Total size of HFiles: 4.72GB
Below is the table description:
COLUMN FAMILIES DESCRIPTION
{NAME => 'CF1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => '3024000 SECONDS (35 DAYS)', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
Below are my benchmarking details:
Records | Time in millis |
1 | 1 |
10 | 1 |
100 | 5 |
1000 | 44 |
10000 | 300 |
60000 | 8148 |
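To make the scaling in the table above easier to read, here is a minimal sketch that converts each row into microseconds per record. The class and method names (`PerRecordCost`, `microsPerRecord`) are made up for illustration; only the batch sizes and timings come from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PerRecordCost {
    // Microseconds spent per record for a batch of `records` Gets that took `millis` ms.
    static double microsPerRecord(long records, long millis) {
        return millis * 1000.0 / records;
    }

    public static void main(String[] args) {
        // Batch size -> elapsed milliseconds, copied from the benchmark table.
        Map<Long, Long> timings = new LinkedHashMap<>();
        timings.put(1L, 1L);
        timings.put(10L, 1L);
        timings.put(100L, 5L);
        timings.put(1000L, 44L);
        timings.put(10000L, 300L);
        timings.put(60000L, 8148L);

        for (Map.Entry<Long, Long> e : timings.entrySet()) {
            System.out.printf("%6d records: %8.1f us/record%n",
                    e.getKey(), microsPerRecord(e.getKey(), e.getValue()));
        }
    }
}
```

By this arithmetic the cost falls to 30 us per record at 10,000 records and then rises to roughly 136 us per record at 60,000, so the last step scales worse than linearly. The smallest batches are measured at millisecond resolution, so their per-record figures are only rough.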
Thanks,
Mani
Created 06-06-2019 06:54 PM
Please share the code you are using to run this benchmark. If it is using some sensitive data, please reproduce it using non-sensitive data. Without seeing how you are executing the timings, it's near impossible to give any meaningful advice.
Created 06-07-2019 10:37 AM
Hi, below is the code I use to measure the time for the bulk get.
// Wall-clock time around a single bulk get of the whole list
long startTime = System.currentTimeMillis();
Result[] results = table.get(getOperationList1);
totalgetTime = (System.currentTimeMillis() - startTime);
logger.info("get time " + totalgetTime + " for records " + getOperationList1.size());
My row key is a byte array of length 24, and each value is 48 bytes long.
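One possible way to narrow down where the time goes is to time the same list in smaller chunks instead of one call. The following is only a sketch under assumptions: `ChunkedTiming`, `partition`, `timeChunks` and the chunk size of 10,000 are hypothetical names and values chosen for illustration, and a list of integers stands in for the Get objects so the example runs without an HBase cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkedTiming {
    // Splits `items` into consecutive sublists of at most `chunkSize` elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += chunkSize) {
            chunks.add(items.subList(i, Math.min(i + chunkSize, items.size())));
        }
        return chunks;
    }

    // Runs `action` on each chunk and returns the elapsed milliseconds per chunk.
    static <T> List<Long> timeChunks(List<T> items, int chunkSize, Consumer<List<T>> action) {
        List<Long> elapsedMillis = new ArrayList<>();
        for (List<T> chunk : partition(items, chunkSize)) {
            long start = System.nanoTime();
            action.accept(chunk);
            elapsedMillis.add((System.nanoTime() - start) / 1_000_000);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Stand-in for the 60,000 Get objects in getOperationList1.
        List<Integer> fakeGets = new ArrayList<>();
        for (int i = 0; i < 60_000; i++) {
            fakeGets.add(i);
        }
        // In the real benchmark the action would call table.get(chunk),
        // wrapped in a try/catch because that call throws IOException.
        List<Long> times = timeChunks(fakeGets, 10_000, chunk -> { });
        System.out.println(times.size() + " chunks timed (ms): " + times);
    }
}
```

If the per-chunk times stay close to the 300 ms measured for 10,000 records, the extra cost would seem tied to the size of a single call; if some chunks are much slower than others, the row keys in those chunks would be worth a closer look.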