HBase operations slow after mass delete

Explorer

I have an HBase table on my cluster (about 5M rows). I wrote a MapReduce job to delete a large number of rows from the table. After the delete, only about 30K rows remain, but operations on the table have become very slow, for example count, scan, disable, ... In the Cloudera Manager dashboard I can see that the region size of my table is still ~6G, and I think that is the problem. Does anyone have any idea? Please help me!

2 REPLIES

Explorer

The disk space won't be reclaimed from the region files until there has been a major compaction. You can trigger one manually from the HBase web interface. This is a heavy operation if you run it on all regions at once, so be aware that it will impact performance while it's running.
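To illustrate why the table stays large and slow until a compaction runs, here is a minimal toy sketch in Python. It is not HBase code: the `ToyStore` class, its methods, and the row counts (scaled down from 5M/30K to 5000/30) are all assumptions made up for illustration. It only models the general idea that deletes are written as extra marker cells, so the old data still takes space and still has to be read past until everything is rewritten.

```python
# Toy model only: shows why deleted rows keep costing space and scan time
# until a compaction rewrites the files. This is NOT HBase code; every name
# here is hypothetical.

TOMBSTONE = object()  # marker meaning "this row was deleted"


class ToyStore:
    def __init__(self):
        # Each "file" is an immutable list of (row_key, value) cells.
        self.files = []

    def flush(self, cells):
        """Write a batch of puts or deletes as a new immutable file."""
        self.files.append(list(cells))

    def stored_cells(self):
        """Number of cells physically stored, delete markers included."""
        return sum(len(f) for f in self.files)

    def scan(self):
        """Return the live rows and how many cells were read to find them."""
        latest = {}
        cells_read = 0
        for f in self.files:  # later files override earlier ones
            for key, value in f:
                cells_read += 1
                latest[key] = value
        live = {k: v for k, v in latest.items() if v is not TOMBSTONE}
        return live, cells_read

    def major_compact(self):
        """Rewrite everything into one file, dropping deleted rows and markers."""
        live, _ = self.scan()
        self.files = [sorted(live.items())]


store = ToyStore()
store.flush((f"row{i:04d}", i) for i in range(5000))              # initial load
store.flush((f"row{i:04d}", TOMBSTONE) for i in range(30, 5000))  # mass delete

live, read_before = store.scan()
size_before = store.stored_cells()

store.major_compact()
_, read_after = store.scan()
size_after = store.stored_cells()

print(len(live), size_before, read_before)  # live rows, stored cells, cells read
print(size_after, read_after)               # the same figures after compaction
```

In this sketch the mass delete makes the store bigger rather than smaller, and a scan for the 30 surviving rows still reads every old cell plus every delete marker; only the rewrite step shrinks both numbers. In HBase the corresponding step is the major compaction mentioned above, which can also be requested from the HBase shell with `major_compact 'my_table'` (the table name here is a placeholder).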

Explorer
Hi ducna, how are you? Regarding your MapReduce job to delete a large number of rows in HBase: could you share it with me? I've been facing the same problem and am looking for examples of the best approach to mass-deleting rows from HBase.