Created on 12-11-2018 03:03 AM - edited 09-16-2022 06:58 AM
I have a very weird issue: my Hadoop cluster has run out of space. Upon investigation I found that one of the databases was consuming about 77 TB of space. However, when I go inside the directory, the total space consumed by all the tables is only about 5 TB. So what is consuming the rest of the space, or where did it go?
I'm checking space usage with the following command:
hadoop fs -du -h /user/hive/warehouse
My Cloudera Manager version is 5.13.
Created 12-15-2018 11:57 PM
So the problem was with snapshots. I had configured snapshots on the /user/hive/warehouse directory a long time ago, and they were still being generated.
I was checking space usage with these commands:
hadoop fs -du -h /user/hive
hadoop fs -du -h /user/hive/warehouse
Snapshottable directories can be listed with:
hdfs lsSnapshottableDir
A snapshot can be deleted with:
hadoop fs -deleteSnapshot <path without .snapshot> <snapshotname>
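The cleanup steps above can be sketched end to end. The warehouse path comes from the thread; the snapshot name shown is an illustrative placeholder, and the actual names come from listing the .snapshot directory on your cluster:

```shell
# List all snapshottable directories on the cluster
hdfs lsSnapshottableDir

# List the snapshots that exist under a snapshottable directory
hdfs dfs -ls /user/hive/warehouse/.snapshot

# Compare live data vs. snapshot-referenced data; snapshots pin blocks
# for files that were later deleted, which is where "missing" space hides
hdfs dfs -du -s -h /user/hive/warehouse
hdfs dfs -du -s -h /user/hive/warehouse/.snapshot

# Delete a snapshot that is no longer needed
# (the name here is hypothetical; use one from the .snapshot listing)
hdfs dfs -deleteSnapshot /user/hive/warehouse s20181201-000000.000
```

Space is reclaimed only once no remaining snapshot references the deleted files, so very old snapshots are usually the ones worth removing first.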
Created 12-11-2018 07:29 AM
Are you using Cloudera Enterprise by any chance? If so, you can generate a report from CM -> Clusters (top menu) -> Reports -> Directory Usage.
For more details, please refer to:
Created 12-14-2018 02:58 PM
One thing that would help us provide more suggestions is to understand the following: