Support Questions

ducna · ‎07-24-2017

I have an HBase table on my cluster (about 5M rows). I wrote a MapReduce job to delete a large number of rows from the table. After the delete, only about 30K rows remain, but operations on the table have become very slow, for example count, scan, and disable. In the Cloudera Manager dashboard I can see that the region size of the table is still ~6 GB, and I think that is the problem. Does anyone have any idea? Please help!

john1 · ‎08-04-2017

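Disk space is not reclaimed from the region files until a major compaction has run: a delete in HBase only writes a tombstone marker, and the deleted cells are physically removed when the store files are rewritten during a major compaction. You can trigger one manually from the HBase web interface. It is a heavy operation if you run it on all regions at once, so be aware that it will impact performance while it is running.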
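For anyone who prefers to script this rather than use the web interface, here is a minimal sketch that requests a major compaction through the HBase shell's `major_compact` command. It assumes the `hbase` launcher is on the PATH and that the shell supports non-interactive mode (`-n`); the helper names and the table name `my_table` are hypothetical placeholders, not something from the original poster's cluster.

```python
import subprocess


def build_major_compact_command(table, family=None):
    """Return the HBase shell command that requests a major compaction.

    `family` optionally restricts the compaction to one column family.
    """
    if family is None:
        return f"major_compact '{table}'"
    return f"major_compact '{table}', '{family}'"


def request_major_compaction(table, family=None, hbase_bin="hbase"):
    """Pipe the command into a non-interactive HBase shell session.

    Assumes `hbase_bin` resolves to the HBase launcher script.
    Returns the shell's exit code.
    """
    command = build_major_compact_command(table, family)
    result = subprocess.run(
        [hbase_bin, "shell", "-n"],  # -n: non-interactive mode
        input=command + "\n",
        text=True,
        capture_output=True,
    )
    return result.returncode


# Example (placeholder table name, needs a working HBase client):
# request_major_compaction("my_table")
```

Note that the request is asynchronous: the shell returns as soon as the compaction is queued, so the region size will only drop once the region servers have finished rewriting the files. Running it during a quiet period is probably the safest choice.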

Leo_BR · ‎06-11-2018

Hi ducna, how are you? Regarding your MapReduce job that deletes a large number of rows in HBase: can you share it with me? I've been facing the same problem and I'm looking for examples of the best approach to mass-deleting rows from HBase.


HBase operations slow after mass delete