
What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?


New Contributor

- The size of the HFiles for a table with 3 lakh (300,000) entries is 1.2 G:
$ hdfs dfs -du -h /hbase/data/default/testTable/

806 /hbase/data/default/testTable/.tabledesc

0 /hbase/data/default/testTable/.tmp

1.2 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4

1.2 G /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e

- Deleted all the entries in the table and triggered a major compaction via the HBase client. After this, the sizes of the HFiles are:

$ hdfs dfs -du -h /hbase/data/default/testTable/

806 /hbase/data/default/testTable/.tabledesc

0 /hbase/data/default/testTable/.tmp

127.3 M /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4

217.4 M /hbase/data/default/testTable/21c5017b57212f76672080e8e3f0724e

We can still see some of the deleted entries in the HFiles, even though the count of the table is zero. What could be the cause of this scenario (HFiles still holding a few entries, around 300 MB in size, even after deletion and major compaction)?
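For reference, the deletes and the compaction were driven through the HBase client API; a purely illustrative shell equivalent of the steps (row keys and exact client calls assumed) would be:

hbase> deleteall 'testTable', '<rowkey>'   # repeated for each row key; actual deletes were issued via the client
hbase> count 'testTable'                   # now reports 0 rows
hbase> major_compact 'testTable'           # trigger major compaction on the table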


Re: What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?

Can you check whether it is the HFiles that are holding up that space, or recovered.edits?

hdfs dfs -du -h /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/*

Also, make sure that you have KEEP_DELETED_CELLS => 'FALSE':

hbase(main):004:0> describe 'testTable' 
Table testTable is ENABLED 
testTable 
COLUMN FAMILIES DESCRIPTION 
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

If not, you can turn it off:

hbase> alter 'testTable', NAME => 'f1', KEEP_DELETED_CELLS => false

Also, after deleting the data, execute flush 'testTable' before running the major compaction on the table.
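For example, the full sequence in the HBase shell (assuming the table name 'testTable' from above) would be roughly:

hbase> flush 'testTable'
hbase> major_compact 'testTable'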

Re: What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?

New Contributor

1) recovered.edits is not holding up the space; clearly the HFiles are holding some data (deleted row keys, column families, column qualifiers):

0 /hbase/data/default/testTable/0bce2f3457622bf79f75222e9c3107a4/recovered.edits

2) KEEP_DELETED_CELLS is 'false'.
3) Executed the flush command too.
Still the issue persists.

Re: What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?

You can check whether all the rows are actually deleted by reading an HFile and looking for delete markers:

${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475
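To see the delete markers themselves, you can add the -p option so the tool prints every cell; deleted cells should then show up with a type such as Delete, DeleteColumn, or DeleteFamily in the printed key (the path below is only a placeholder for one of your HFiles):

${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f hdfs://<namenode>:8020/hbase/data/default/testTable/<region>/<cf>/<hfile>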

Re: What could be the size of HFiles after deleting all the rows in a table and major compaction is triggered?

New Contributor

We have read the HFile, and it has all the values (key, column family, column qualifier, value). Why does the HFile still retain the data even though everything has been deleted and the count of the table is zero?
Could you explain how to look for delete markers?
