HBASE "archive". How to clean? My disk space is vanishing....

Rising Star

Hi! So, I'm the sysadmin of a Hadoop cluster. I am not a developer, nor do I "use" it. But I make sure it's running and happy and secure and so on.

In reviewing HDFS disk use lately, I noticed our numbers are kinda high.

After some digging, it appears all of the space is going into hbase. OK cool, that's what our developers are doing. Stuffing things in hbase.

But I appear to be losing a bunch of disk space to the hbase "archive" folder, which I assume is where hbase puts stuff when tables are deleted or...?

I checked with one of our developers, and he sees tables in the archive that he deleted long ago.

So... my simple question is, how do I clean out unneeded things from the hbase "archive"? I assume manually deleting stuff via hdfs is **not** the way to go.

hdfs dfs -du -s -h /apps/hbase/data/*
338.6 K /apps/hbase/data/.hbase-snapshot
0 /apps/hbase/data/.tmp
20 /apps/hbase/data/MasterProcWALs
830 /apps/hbase/data/WALs
6.6 T /apps/hbase/data/archive <=== THIS.
0 /apps/hbase/data/corrupt
4.1 T /apps/hbase/data/data
42 /apps/hbase/data/hbase.id
7 /apps/hbase/data/hbase.version
30.7 K /apps/hbase/data/oldWALs

ANY and all help for an hbase newbie would be really appreciated.

8 REPLIES

Rising Star

As far as I can find, the hbase.master.hfilecleaner.ttl value was not set at all (does that then mean NO cleaning?). I set it to 900000 ms (15 minutes) and we'll see if anything happens.


Super Collaborator

Actually that's supposed to be something like 5 minutes by default. So, check whether you have any old snapshots that you don't need anymore.
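
In other words, an unset value should fall back to the default rather than disable cleaning. As a rough way to check what a given config file actually says, here is a minimal Python sketch. It assumes the usual `<configuration>/<property>/<name>/<value>` layout of hbase-site.xml and takes the 5 minute default from the reply above as an assumption; the helper name `hfilecleaner_ttl_ms` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed default (5 minutes, in milliseconds), as mentioned in the reply above.
DEFAULT_TTL_MS = 5 * 60 * 1000

def hfilecleaner_ttl_ms(hbase_site_xml: str) -> int:
    """Return hbase.master.hfilecleaner.ttl from hbase-site.xml text,
    or the assumed default when the property is not set."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hbase.master.hfilecleaner.ttl":
            return int(prop.findtext("value", "").strip())
    return DEFAULT_TTL_MS

# Hypothetical config fragment matching the value set earlier in this thread.
sample = """<configuration>
  <property>
    <name>hbase.master.hfilecleaner.ttl</name>
    <value>900000</value>
  </property>
</configuration>"""

ttl = hfilecleaner_ttl_ms(sample)
print(ttl, "ms =", ttl / 60000, "minutes")
```

The value is in milliseconds, so 900000 works out to 15 minutes. A short TTL on its own will not shrink the archive if something else is still holding on to the files.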

Super Guru
@Kent Brodie

I am assuming you run major compactions on some regular schedule, probably once a week, so that is not the issue.

Do you have a lot of snapshots? Here is how snapshots work. When you create a snapshot, it only captures metadata at that point in time. If you ever have to go back to that point in time, you restore the snapshot, and the captured metadata tells it which data to restore.

Now, as HBase is running, you might be deleting data. Usually, when a major compaction runs, your deleted data is gone for good and the disk space is recovered. However, if you have snapshots pointing to data that is being deleted, HBase will not delete that data, because you might still want to recover to that point in time by restoring the snapshot. So, in that case, the data the snapshot points to is moved to the archive folder and kept there.

The more snapshots you have, the more the archive folder grows to hold the data they need.

I can only guess, but a reasonable guess is that you are seeing the effect of too many snapshots.
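
To make the idea concrete, here is a toy Python model of why a snapshot pins files in the archive. This is only a sketch of the behavior described above, not HBase's actual cleaner code: the `ArchivedFile` class, the `deletable` function, the file names, and the snapshot name are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArchivedFile:
    name: str    # hypothetical archived file name
    age_ms: int  # time since the file landed in the archive

def deletable(files, snapshot_refs, ttl_ms):
    """Return archived files that are past the TTL and that no
    remaining snapshot still references."""
    referenced = set().union(*snapshot_refs.values())
    return [f for f in files if f.age_ms > ttl_ms and f.name not in referenced]

files = [
    ArchivedFile("hfile_a", 86_400_000),  # one day old, referenced by a snapshot
    ArchivedFile("hfile_b", 86_400_000),  # one day old, unreferenced
    ArchivedFile("hfile_c", 60_000),      # one minute old, still within the TTL
]
snapshots = {"snap_old": {"hfile_a"}}     # snapshot name -> files it points to
TTL_MS = 300_000                          # assumed 5 minute TTL

print([f.name for f in deletable(files, snapshots, TTL_MS)])  # ['hfile_b']

# Once the old snapshot is dropped, the file it pinned becomes eligible too.
del snapshots["snap_old"]
print([f.name for f in deletable(files, snapshots, TTL_MS)])  # ['hfile_a', 'hfile_b']
```

On a real cluster, the HBase shell commands `list_snapshots` and `delete_snapshot 'name'` are the way to see what exists and to drop the ones nobody needs; after that, the archive should shrink on its own rather than through manual HDFS deletes.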

Rising Star

yup yup yup. Found the snapshots.... guessing THAT is the culprit. Time to have a conversation with the developers.... there's.. a lot.

New Contributor

@Kent Brodie
Did you get a solution? Please share.
