What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?

Explorer

- The HFiles for a table with 3 lakh (300,000) entries take up 1.2 G:
$ hdfs dfs -du -h /hbase/data/default/testTable/

806 /hbase/data/default/testTable/.tabledesc

0 /hbase/data/default/testTable/.tmp

1.2 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4

1.2 G /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e

- Deleted all the entries in the table and triggered a major compaction via the HBase client. After this, the HFile sizes are:

$ hdfs dfs -du -h /hbase/data/default/testTable/

806 /hbase/data/default/testTable/.tabledesc

0 /hbase/data/default/testTable/.tmp

127.3 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4

217.4 M /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e

We can still see some of the deleted entries in the HFiles, even though the row count of the table is zero. What could cause this (HFiles still holding entries after deletion and compaction, with a total size of around 300 MB)?

4 REPLIES

Can you check whether that space is being held by the HFiles or by recovered.edits?

hdfs dfs -du -h /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/*
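To see at a glance which entry dominates, it can help to drop `-h` so the sizes are plain byte counts and then sort them. A minimal sketch, assuming the first column of the `hdfs dfs -du` output is the size in bytes; the helper name `largest_first` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: sort `hdfs dfs -du` lines (read from stdin) by the
# first column, largest first. Assumes column 1 is a plain byte count,
# which is the case when -h is omitted.
largest_first() {
    sort -k1,1nr
}

# Example usage against a live cluster (not run here):
#   hdfs dfs -du '/hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/*' | largest_first
```

If recovered.edits shows up near the top, the space is not in the HFiles themselves.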

Also make sure that KEEP_DELETED_CELLS => 'FALSE' is set on the column family:

hbase(main):004:0> describe 'testTable' 
Table testTable is ENABLED 
testTable 
COLUMN FAMILIES DESCRIPTION 
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

If it is not, you can turn it off:

hbase> alter 'testTable', NAME => 'f1', KEEP_DELETED_CELLS => false

Also, after deleting the data, run flush 'testTable' before running the major compaction on the table.
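The flush-then-compact order can be scripted so the two steps are always issued together. This is only a sketch: `compaction_commands` is a hypothetical helper that prints the HBase shell commands, and piping them into `hbase shell` assumes the shell accepts commands on standard input in your version.

```shell
#!/bin/sh
# Hypothetical helper: print the HBase shell commands for "flush, then
# major compact" on one table. The table name is the only argument.
compaction_commands() {
    table="$1"
    printf "flush '%s'\n" "$table"
    printf "major_compact '%s'\n" "$table"
}

# Example usage against a live cluster (not run here):
#   compaction_commands testTable | hbase shell
```

As far as I know, `major_compact` only submits the request and returns before the compaction has finished, so the sizes reported by `hdfs dfs -du` may not change immediately.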

Explorer

1) recovered.edits is not holding the space. The HFile clearly still contains data (deleted row keys, columns, column qualifiers):

0 /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/recovered.edits

2) KEEP_DELETED_CELLS is 'false'.
3) Executed the flush command too.
The issue still persists.

You can check whether all the rows were actually deleted by reading an HFile and looking for delete markers.

${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475
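The tool prints individual cells when `-p` is added to the command. Assuming each printed cell line starts with `K: ` followed by a key of the form `row/family:qualifier/timestamp/Type/...` (the exact layout may differ between HBase versions), the cell types can be tallied roughly like this. `count_cell_types` is a hypothetical helper name, and the sketch does not handle row keys that contain `/` or whitespace:

```shell
#!/bin/sh
# Hypothetical helper: tally cell types (Put, Delete, DeleteColumn,
# DeleteFamily, ...) from HFile tool output read on stdin.
# Assumption: cell lines look like "K: row/cf:q/ts/Type/vlen=... V: ...".
# Row keys containing "/" or whitespace are not handled by this sketch.
count_cell_types() {
    awk '$1 == "K:" { split($2, parts, "/"); counts[parts[4]]++ }
         END { for (t in counts) print t, counts[t] }' | sort
}

# Example usage (the path is a placeholder, not run here):
#   hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f <path-to-hfile> | count_cell_types
```

Types beginning with `Delete` are the delete markers. After a completed major compaction with KEEP_DELETED_CELLS off, one would normally expect neither the markers nor the cells they cover to remain in the file.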

Explorer

We have read the HFile, and it contains all the values (key, cf, cq, value). Why does the HFile still retain the data even though everything has been deleted and the row count of the table is zero?
Could you explain how to look for delete markers?
