
HBase operations slow after mass delete


I have an HBase table on my cluster (about 5M rows). I wrote a MapReduce job to delete a large number of rows from the table. After the delete, only about 30K rows remain. But operations on the table have become very slow, for example count, scan, and disable. I looked at the Cloudera Manager dashboard and saw that the region size of my table is still ~6G. I think that is the problem. Does anyone have any ideas? Please help!



The disk space won't be reclaimed from the region files until a major compaction has run. You can trigger one manually from the HBase web interface. It is a heavy operation if you run it on all regions at once, so be aware that it will impact performance while it's running.
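To illustrate why the region stays at ~6G after the delete, here is a minimal toy model in Python. It is only a sketch and not HBase's actual storage code: the `ToyRegion` class and its method names are made up for illustration. It assumes the usual log-structured behavior, where a delete only writes a marker (a tombstone) and the old cells stay in the store files until a major compaction rewrites them. Until then, reads still have to pass over the deleted cells, which is likely why scans and counts feel slow.

```python
# Toy model of HBase-style deletes; NOT HBase's real storage code.
# ToyRegion and all of its methods are hypothetical names for illustration.
from dataclasses import dataclass, field

TOMBSTONE = object()  # marker written in place of a value when a row is deleted


@dataclass
class ToyRegion:
    # Each "store file" is an immutable list of (row_key, value) cells.
    # Files are kept in write order, oldest first.
    store_files: list = field(default_factory=list)

    def flush(self, cells):
        """Write a batch of cells as a new immutable store file."""
        self.store_files.append(list(cells))

    def delete(self, row_keys):
        """A delete only appends tombstones; the old cells stay on disk."""
        self.flush((key, TOMBSTONE) for key in row_keys)

    def cells_on_disk(self):
        """Number of stored cells, tombstones included (a proxy for region size)."""
        return sum(len(store_file) for store_file in self.store_files)

    def live_rows(self):
        """Rows visible to a reader: the newest cell wins, tombstones hide rows."""
        latest = {}
        for store_file in self.store_files:  # oldest to newest
            for key, value in store_file:
                latest[key] = value
        return {k: v for k, v in latest.items() if v is not TOMBSTONE}

    def major_compact(self):
        """Rewrite all store files into one, dropping deleted cells and tombstones."""
        self.store_files = [sorted(self.live_rows().items())]


region = ToyRegion()
region.flush((f"row{i:04d}", "x") for i in range(1000))  # load 1000 rows
region.delete(f"row{i:04d}" for i in range(990))          # mass delete 990 of them

before = region.cells_on_disk()  # old cells plus tombstones are still stored
region.major_compact()
after = region.cells_on_disk()   # only the live rows remain

print(before, after, len(region.live_rows()))  # prints: 1990 10 10
```

In this model the "size" actually grows after the delete (1000 cells plus 990 tombstones) and only shrinks once `major_compact()` runs. On a real cluster, besides the web interface, the HBase shell command `major_compact 'my_table'` (table name hypothetical) requests the same operation.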

Hi ducna, how are you? Regarding your MapReduce job to delete a large number of rows in HBase: can you share this job with me? I've been facing the same problem and I'm looking for examples of the best approach to mass-deleting rows from HBase.