Data directories mount point space not reclaimed

Contributor

Hi Team,

We have 3 DataNodes, and the data directory mount point is around 1 TB on each node, for a total of 3 TB. I deleted 500 GB of data from HDFS, but the space was released only from the DN3 data directory, not from the other DataNodes.

Please advise.

1 ACCEPTED SOLUTION

Contributor

@hanumanth Check whether the files you deleted from HDFS still exist somewhere in HDFS and, if so, which replication factor applies to them. The replication factor is shown in the second column of the hdfs dfs -ls output. To collect a recursive listing, run:

hdfs dfs -ls -R / > hdfs_recursive

 

Then filter this file to see how many entries use each replication factor (directories show "-" in that column):

awk '{print $2}' hdfs_recursive | sort | uniq -c
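
As a hedged sketch, the same count can be restricted to regular files, so that directories (which show "-" in the second column) do not appear in the result. The function name rf_summary is illustrative and not part of any Hadoop tool, and it assumes the recursive listing was saved to a file first.

```shell
# Summarize replication factors from a saved `hdfs dfs -ls -R` listing.
# Only regular files are counted: their permission string starts with "-",
# while directories start with "d" and show "-" as replication factor.
rf_summary() {
  # $1: path to the saved listing (for example hdfs_recursive)
  awk '$1 ~ /^-/ { count[$2]++ }
       END { for (rf in count) print rf, count[rf] }' "$1" | sort -n
}

# Usage (assumes the listing was saved as hdfs_recursive):
#   rf_summary hdfs_recursive
# Each output line is "<replication factor> <number of files>".
```

If most of the deleted data had a replication factor of 1, its blocks may have lived on a single DataNode, which would explain space being freed on only one node.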

 

Also, make sure that nothing other than HDFS (content written by other processes) is filling up the mount points used to store the HDFS blocks.
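
One rough way to check this on each DataNode is to compare the space used on the whole filesystem with the space used under the DataNode data directory. The sketch below is an assumption-laden example: the function name non_hdfs_kib and the example path are hypothetical, and the real data directory is whatever dfs.datanode.data.dir points to on your cluster.

```shell
# Print the KiB used on the filesystem that holds directory $1
# but that are NOT stored under $1 itself.
non_hdfs_kib() {
  data_dir="$1"
  # Used KiB on the whole filesystem (third column of POSIX df output).
  fs_used=$(df -Pk "$data_dir" | awk 'NR==2 {print $3}')
  # Used KiB under the data directory, staying on one filesystem.
  dn_used=$(du -skx "$data_dir" | awk '{print $1}')
  echo $((fs_used - dn_used))
}

# Usage (the path is a placeholder for your DataNode data directory):
#   non_hdfs_kib /data/1/dfs/dn
```

A large value suggests that something outside the data directory is consuming the mount point.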

 

You can also run the following two commands and compare their output to see whether existing data is still protected by snapshots:

#1 hdfs dfs -du -s -x <path>

#2 hdfs dfs -du -s <path>

---> du command usage [1]

The -x option excludes snapshots from the calculation. If the two results differ, snapshots are probably still present and are preventing the blocks from being deleted from the DataNodes.
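
A minimal sketch of that comparison follows, assuming the du in question is hdfs dfs -du, whose -x option excludes snapshots from the calculation. The helper name snapshot_only_bytes and the path /data are placeholders, not part of HDFS.

```shell
# Bytes held only by snapshots: size with snapshots included
# (hdfs dfs -du -s) minus size with snapshots excluded (hdfs dfs -du -s -x).
snapshot_only_bytes() {
  with_snapshots="$1"
  without_snapshots="$2"
  echo $((with_snapshots - without_snapshots))
}

# Usage on a cluster (/data is a placeholder path):
#   with=$(hdfs dfs -du -s /data | awk '{print $1}')
#   without=$(hdfs dfs -du -s -x /data | awk '{print $1}')
#   snapshot_only_bytes "$with" "$without"
#   hdfs lsSnapshottableDir   # lists directories on which snapshots are allowed
```

A non-zero result means snapshots still reference data that is no longer in the live tree.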

 


6 REPLIES

Rising Star

Hi @hanumanth,

Could you please check the following:

1) Whether the replication factor of the deleted files was 1.

2) Whether there are blocks still pending deletion. This can be checked from the NameNode UI.

3) Whether HDFS snapshots are configured on the deleted paths or their parent directories.
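
Besides the NameNode UI, the pending-deletion count from item 2 can usually be read from the NameNode JMX endpoint, where the FSNamesystem bean reports a PendingDeletionBlocks value. The sketch below only parses that JSON. The function name is illustrative, and the host and port in the usage comment are placeholders (9870 is the default NameNode HTTP port in Hadoop 3, 50070 in Hadoop 2).

```shell
# Extract the PendingDeletionBlocks value from NameNode JMX JSON read on stdin.
pending_deletion_blocks() {
  grep -o '"PendingDeletionBlocks"[[:space:]]*:[[:space:]]*[0-9][0-9]*' |
    grep -o '[0-9][0-9]*$'
}

# Usage (namenode.example.com is a placeholder host):
#   curl -s 'http://namenode.example.com:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' \
#     | pending_deletion_blocks
```

If the number stays above zero for a long time, the NameNode has not yet finished asking the DataNodes to remove those blocks.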

 

Thank you

Contributor

Hi Pajoshi,

Please find my inline comments.

1) Whether the replication factor of the deleted files was 1.

--> The replication factor is 2.

2) Whether there are blocks still pending deletion. This can be checked from the NameNode UI.

--> We observed 6 blocks pending deletion.

3) Whether HDFS snapshots are configured on the deleted paths or their parent directories.

--> We deleted the HDFS snapshots.

 

Community Manager

@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. 


Regards,

Diana Torres,
Community Moderator



Contributor

Not yet, the issue is not resolved.
