Where did my space on the Hadoop cluster go?
Labels: Apache Hive, HDFS
Created on ‎12-11-2018 03:03 AM - edited ‎09-16-2022 06:58 AM
I have a very weird issue: my Hadoop cluster has run out of space. Upon investigation I found that one of the databases was consuming about 77 TB of space. However, when I go inside that directory, the total space consumed by all the tables is only about 5 TB. So what is consuming the rest of the space, or where did it go?
I'm checking space usage with the following command:
hadoop fs -du -h /user/hive/warehouse
My Cloudera Manager version is 5.13.
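For context, two related checks that put the -du output above in perspective (a sketch; the paths are the same ones used in this thread):
# One summarized line for the whole warehouse tree
hdfs dfs -du -s -h /user/hive/warehouse
# Overall filesystem capacity, used and remaining space, for comparison
hdfs dfs -df -h /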
Created ‎12-15-2018 11:57 PM
So the problem was with snapshots. I had configured snapshots a long time ago on the /user/hive/warehouse directory, and they were still being generated.
I was checking the space usage with the commands:
hadoop fs -du -h /user/hive
hadoop fs -du -h /user/hive/warehouse
Snapshottable directories can be listed with:
hdfs lsSnapshottableDir
A snapshot can be deleted with:
hadoop fs -deleteSnapshot <path without .snapshot> <snapshotname>
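As a rough end-to-end sketch of the cleanup (the snapshot name s20180101 is only an example, so list the .snapshot directory first to see the real names; -disallowSnapshot needs HDFS superuser rights and only succeeds once every snapshot on the path has been deleted):
# List every snapshottable directory on the cluster
hdfs lsSnapshottableDir
# See which snapshots currently exist under the warehouse directory
hdfs dfs -ls /user/hive/warehouse/.snapshot
# Delete one snapshot by name to release the blocks it was retaining
hdfs dfs -deleteSnapshot /user/hive/warehouse s20180101
# Optionally make the directory non-snapshottable afterwards
hdfs dfsadmin -disallowSnapshot /user/hive/warehouse
If the snapshots were being created by a Cloudera Manager snapshot policy, that policy would also need to be paused or deleted, otherwise new snapshots will keep appearing.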
Created ‎12-11-2018 07:29 AM
Are you using Cloudera Enterprise by any chance? If so, you can generate a report from CM -> Clusters (top menu) -> Reports -> Directory Usage.
For more details, please refer to the documentation.
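Without the Enterprise report, a rough command-line approximation of the same information (a sketch using the path from this thread) is:
# Directory count, file count and total content size under the warehouse
hdfs dfs -count -h /user/hive/warehouse
# Quota and space-quota figures as well, if quotas are set on the path
hdfs dfs -count -q -h /user/hive/warehouse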
Created ‎12-11-2018 11:27 PM
Created ‎12-14-2018 02:58 PM
One thing that would help us provide some more suggestions is to understand the following:
- How did you come to know that your "Hadoop cluster ran out of space"? What did you see, exactly, that told you there was a problem?
- What did you run to see that a database was using 77 TB? What was the output?
- What command did you run to see that only 5 TB of table data was used? What was the output?
A sketch of commands that could produce this information is shown below.
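For instance, assuming the default Hive warehouse layout where each database lives in a <database>.db directory (the placeholder is only an illustration, not an actual path from this cluster):
# Cluster-wide capacity and usage as seen by the NameNode (may need HDFS superuser privileges)
hdfs dfsadmin -report
# Space charged to one database directory as a whole
hdfs dfs -du -s -h /user/hive/warehouse/<database>.db
# Space of each individual table directory inside it
hdfs dfs -du -h /user/hive/warehouse/<database>.db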
Created ‎12-11-2018 09:06 AM
