04-25-2017 08:59 AM
I'm facing a problem with table size on HDFS and need to reduce it. I checked the size with the following command:
hadoop fs -du -s -h /hbase/archive/data/default/*
23.9 M 71.6 M /hbase/archive/data/default/TEST
I checked the table with org.apache.hadoop.hbase.mapreduce.RowCounter and saw that there are almost 700k records, most of them very old. So I set a TTL of 1 month on the table. After that, RowCounter shows 300k records.
I've read that a major compaction has to run before records expired by TTL are actually dropped. I ran it on this table several times, but the size on HDFS didn't change.
The main question is: why is that, and what can I do to decrease the size of the table on HDFS?
Thank you in advance for your answer.
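For readers following along, the TTL change and the major compaction described above could look roughly like the sketch below. It only prints the HBase shell statements so they can be reviewed first. The table name TEST comes from the post; the column family name `cf` is an assumption (TTL is set per column family, and the post does not name one), and `ttl_seconds` and `hbase_commands` are hypothetical helper names.

```shell
# Table name from the post; the column family 'cf' is an assumption.
TABLE="TEST"
FAMILY="cf"

# HBase expresses TTL in seconds, so convert a number of days.
ttl_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

# Print the HBase shell statements instead of executing them.
hbase_commands() {
  echo "alter '${TABLE}', {NAME => '${FAMILY}', TTL => $(ttl_seconds 30)}"
  echo "major_compact '${TABLE}'"
}

hbase_commands
# To apply them on a cluster (assumes the hbase client is on PATH):
#   hbase_commands | hbase shell
```

Note that `major_compact` only requests the compaction; it runs asynchronously, so the shell returning does not mean the rewrite has finished.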
04-25-2017 12:08 PM
There could be multiple reasons; this is one of them:
The data that you deleted might have been moved to the .Trash folder, where it is kept for 24 hours by default.
So you will not see a size difference for 24 hours after the delete.
Please check the "/hbase/archive/data/default/.Trash" folder. If there is data in it, either wait 24 hours (if the default setting has not been changed) or delete the data from the .Trash folder to see the result immediately.
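A minimal sketch of the Trash check suggested above. HDFS normally keeps trash under the home directory of the user who deleted the files, that is /user/<username>/.Trash, so the path on a given cluster may differ from the one quoted above. The retention period is controlled by `fs.trash.interval` (in minutes). The helper names `trash_path` and `check_trash` are hypothetical, and the user `hdfs` is only an example.

```shell
# Default HDFS trash location for a user (assumption: no encryption
# zones or custom layout are involved).
trash_path() {
  echo "/user/$1/.Trash"
}

# Hypothetical helper: report the trash setting and usage for one user.
check_trash() {
  user="$1"
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop client not found; would check $(trash_path "$user")"
    return 0
  fi
  # Retention in minutes; 0 means trash is disabled.
  hdfs getconf -confKey fs.trash.interval
  # Space currently held in that user's trash.
  hadoop fs -du -s -h "$(trash_path "$user")"
}

check_trash hdfs
# To purge expired trash checkpoints for the current user:
#   hadoop fs -expunge
```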
04-26-2017 07:35 AM
Thank you @saranvisa for the answer!
Unfortunately, my Trash is at /user/hdfs/.Trash and it is empty:
hadoop fs -du -s -h /hbase/archive/data/default/.Trash
du: `/hbase/archive/data/default/.Trash': No such file or directory
What other reasons could there be? What else can I check?
Thank you in advance.