Created 09-13-2017 07:57 AM
- The HFiles for a table with 3 lakh (300,000) entries take up 1.2 G:
$ hdfs dfs -du -h /hbase/data/default/testTable/
806 /hbase/data/default/testTable/.tabledesc
0 /hbase/data/default/testTable/.tmp
1.2 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4
1.2 G /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e
- Deleted all the entries in the table and triggered a major compaction via the HBase client. After this, the sizes of the HFiles are:
$ hdfs dfs -du -h /hbase/data/default/testTable/
806 /hbase/data/default/testTable/.tabledesc
0 /hbase/data/default/testTable/.tmp
127.3 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4
217.4 M /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e
We can still see some of the deleted entries in the HFiles, even though the row count of the table is zero. What could cause this? (The HFiles still hold entries after deletion and compaction, with a total size of around 300 MB.)
Created 09-13-2017 08:24 AM
Can you check whether that space is being held by the HFiles or by recovered.edits?
hdfs dfs -du -h /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/*
Also, make sure that you have KEEP_DELETED_CELLS => 'FALSE':
hbase(main):004:0> describe 'testTable'
Table testTable is ENABLED
testTable
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
If it is not, you can turn it off:
hbase> alter 't1', NAME => 'f1', KEEP_DELETED_CELLS => false
Also, after deleting the data, run flush 'testTable' before running the major compaction on the table.
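The flush-then-compact order described above could be scripted roughly as follows. This is only a sketch: the helper name `flush_and_compact_cmds` is made up for illustration, and it assumes the `hbase` CLI is on the PATH and that the table is called testTable as in this thread. The helper just emits the two HBase shell commands so they can be piped into a non-interactive shell.

```shell
# Hypothetical helper: print the HBase shell commands that flush the
# memstore and then request a major compaction for the given table.
flush_and_compact_cmds() {
  printf "flush '%s'\nmajor_compact '%s'\n" "$1" "$1"
}

# Usage (assumes the hbase CLI is available on this host):
# flush_and_compact_cmds testTable | hbase shell -n
```

Note that major_compact only requests the compaction and returns before it finishes, so the directory sizes are best checked again with hdfs dfs -du -h once the compaction has had time to complete.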
Created 09-13-2017 11:13 AM
1) recovered.edits is not holding the space. The HFile clearly still contains data (deleted row keys, columns, column qualifiers):
0 /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/recovered.edits
2) KEEP_DELETED_CELLS is 'false'
3) Executed the flush command too.
Still the issue persists.
Created 09-13-2017 11:30 AM
You can check whether all the rows were actually deleted by reading an HFile and looking for delete markers:
${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475
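One way to spot delete markers is to print the cells with the tool's -p option and filter on the cell type. The sketch below assumes each cell is printed in the usual KeyValue form, row/family:qualifier/timestamp/Type/vlen=N/..., where Type is Put for live data and Delete, DeleteColumn, DeleteFamily or DeleteFamilyVersion for markers. The function name `count_delete_markers` is made up for illustration, and the output format may differ between HBase versions.

```shell
# Hypothetical helper: count delete markers in HFile tool output read from stdin.
# Assumes cells are printed as "K: row/family:qualifier/timestamp/Type/vlen=N/...".
count_delete_markers() {
  # grep -c exits non-zero when nothing matches, so mask that exit status;
  # the count (possibly 0) is still printed.
  grep -cE '/Delete(Column|Family|FamilyVersion)?/vlen=' || true
}

# Usage (substitute the path of an HFile under the region's column family directory):
# ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f <path-to-hfile> | count_delete_markers
```

If the count is zero and the file contains only Put cells, the file holds live data rather than markers; if markers are present alongside the Puts they cover, the file has probably not been rewritten by a major compaction yet.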
Created 09-13-2017 12:50 PM
We have read the HFile, and it has all the values (key, cf, cq, value). Why does the HFile still retain the data even though everything has been deleted and the count of the table is zero?
Could you explain how to look for delete markers?