Created 11-04-2022 04:31 AM
Hi Team,
We have 3 DataNodes, and the data directory mount point on each node is about 1 TB, so the total data directory capacity is about 3 TB. I deleted 500 GB of data from HDFS, but space was released only from the DN3 data directory, not from the other DataNodes.
Please advise..
Created 11-14-2022 07:14 AM
@hanumanth you may check whether the files you deleted from HDFS still exist somewhere in HDFS. If they do, check the replication factor applied to them; it is shown in the second column of the hdfs dfs -ls output. To do this, collect a recursive listing by running
hdfs dfs -ls -R / > hdfs_recursive
Then you can filter that file to see which replication factors are applied to your files:
awk '{print $2}' hdfs_recursive | sort | uniq -c
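If you already know which paths you deleted or suspect, a quicker spot check is possible with hdfs dfs -stat; the path below is only a placeholder, replace it with one of yours:
# Print the replication factor and name of a single path
hdfs dfs -stat "%r %n" /path/to/your/file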
Also, ensure that no content from other processes (anything other than HDFS) is filling up the mount points used to store the HDFS blocks.
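One quick way to see this split, assuming you can run it as the HDFS superuser, is the DataNode report, which shows DFS Used and Non DFS Used for each node:
# Per-DataNode breakdown of configured capacity, DFS Used and Non DFS Used
hdfs dfsadmin -report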
You can also compare the following outputs (run against the DataNode data directories) to check whether existing data is protected by snapshots:
#1 du -s -x
#2 du -s
---> du command usage [1]
If the results of the two commands differ, there are probably still snapshots present that are preventing blocks from being deleted from the DataNodes.
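You can also check for leftover snapshots directly at the HDFS level; the directory path below is only a placeholder:
# List all snapshottable directories (run as the HDFS superuser to see every one)
hdfs lsSnapshottableDir
# List the snapshots that still exist under a snapshottable directory
hdfs dfs -ls /path/to/snapshottable/dir/.snapshot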
Created 11-04-2022 07:10 AM
Hi @hanumanth,
Could you please check the following (example commands for point 2 are shown after the list):
1) Whether the replication factor of the deleted files was 1.
2) Whether there are blocks still pending deletion. This can be checked from the NameNode UI.
3) Whether there are HDFS snapshots configured on the deleted paths or their parent directories.
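If you prefer the command line over the NameNode UI, the same pending-deletion counter is exposed over JMX. The host name below is a placeholder, and 9870 is the default NameNode HTTP port on Hadoop 3.x (50070 on 2.x):
# Check the PendingDeletionBlocks counter from the NameNode JMX endpoint
curl -s 'http://namenode-host:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep PendingDeletionBlocks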
Thank you
Created 11-14-2022 05:37 AM
Hi Pajoshi,
Please find my inline comments.
1) Whether the replication factor of the deleted files was 1.
--> Replication factor: 2
2) Whether there are blocks still pending deletion. This can be checked from the NameNode UI.
--> We observed 6 blocks pending deletion.
3) Whether there are HDFS snapshots configured on the deleted paths or their parent directories.
--> We deleted the HDFS snapshots.
Created 11-07-2022 12:56 PM
@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Diana Torres
Created 11-14-2022 05:35 AM
Not yet, the issue is not resolved.
Created 11-17-2022 03:42 PM
@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Diana Torres