Data directories mount point space not reclaimed
Labels: HDFS
Created 11-04-2022 04:31 AM
Hi Team,
We have 3 DataNodes, and the data directories mount point is around 1 TB on each node, so the total data directories size is 3 TB. I deleted 500 GB of data from HDFS, but the space was released only from the DN3 data directory, not from the other DataNodes.
Please advise.
Created 11-04-2022 07:10 AM
Hi @hanumanth,
Could you please check the following (a sketch of CLI checks follows this list):
1) Whether the replication factor of the deleted files was 1.
2) Whether there are blocks still pending deletion. This can be checked from the NameNode UI.
3) Whether there are HDFS snapshots configured on the deleted paths or their parent directories.
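Here is a minimal sketch of how these three checks could be run from the command line; the paths, the NameNode host, and the web UI port 9870 are assumptions, so adjust them to your cluster:
# 1) Replication factor of a path (second column of the ls output; '-' for directories)
hdfs dfs -ls /path/that/was/deleted
# 2) Blocks pending deletion (also visible in the NameNode UI); the
#    FSNamesystemState JMX bean exposes a PendingDeletionBlocks counter
curl -s 'http://<namenode-host>:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState' | grep -i pendingdeletion
# 3) Snapshottable directories and their snapshots
hdfs lsSnapshottableDir
hdfs dfs -ls /path/that/was/deleted/.snapshot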
Thank you
Created 11-14-2022 05:37 AM
Hi Pajoshi,
Please find my inline comments.
1) If the replication factor of deleted files was 1.
--> The replication factor is 2.
2) If there are blocks still pending to be deleted. This could be checked from the NN UI.
--> We observed 6 blocks pending deletion.
3) If there are hdfs snapshots configured on the deleted paths or its parent directory.
--> We deleted the HDFS snapshots.
Created 11-07-2022 12:56 PM
@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Diana Torres, Community Moderator
Created 11-14-2022 05:35 AM
Not yet, the issue isn't resolved.
Created 11-14-2022 07:14 AM
@hanumanth You may check whether the files you deleted from HDFS still exist somewhere in HDFS, and if so, check the replication factor applied to them; it appears in the second column of the hdfs dfs -ls output. For this you can collect a recursive listing by running:
hdfs dfs -ls -R / > hdfs_recursive
Afterwards you can filter this listing to see which replication factors are applied to your files:
hdfs dfs -ls -R / | awk '{print $2}' | sort | uniq -c
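As a small hypothetical refinement of the filter above, directories show '-' instead of a number in the replication column, so you could count only file entries by skipping rows whose permission string starts with 'd':
# count replication factors for files only (directory rows begin with 'd' in column 1)
hdfs dfs -ls -R / | awk '$1 !~ /^d/ {print $2}' | sort | uniq -c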
Also, ensure that no other content from other processes (anything other than HDFS) is filling up the mount points used to store the HDFS blocks.
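One hedged way to check this is to compare the OS-level usage of the mount point with the usage of the DataNode block directory itself; /data/dfs/dn below is a placeholder for your dfs.datanode.data.dir value:
# total usage of the mount point as seen by the OS
df -h /data/dfs
# usage of the DataNode block directory alone; a large gap between the two
# suggests non-HDFS data on the same mount
du -sh /data/dfs/dn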
You can also compare the following outputs to see whether the existing data is protected by snapshots:
#1 hdfs dfs -du -s -x <path> (with -x, snapshot usage is excluded)
#2 hdfs dfs -du -s <path>
---> du command usage [1]
If the results of the two commands differ, there are probably snapshots still present that are preventing blocks from being deleted from the DataNodes.
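If the outputs do differ, a minimal sketch for locating and removing the leftover snapshots would be the following; the path and snapshot name are placeholders:
# list all snapshottable directories in the cluster
hdfs lsSnapshottableDir
# list the snapshots under one of them
hdfs dfs -ls /snapshottable/dir/.snapshot
# delete a specific snapshot so its blocks can be reclaimed
hdfs dfs -deleteSnapshot /snapshottable/dir <snapshot-name>
Note that block deletion on the DataNodes is asynchronous, so space may take a while to be reclaimed even after the snapshots are gone.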
Created 11-17-2022 03:42 PM
@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Diana Torres, Community Moderator
