
HDFS block count does not decrease after deleting data

Hi,

after deleting terabytes of data from HDFS (about 1/4 of the total capacity), the block count on the DataNodes did not decrease as expected. It is still over the critical threshold.

How could it be solved?

Thank you

 


13 REPLIES

Expert Contributor

Hello @andrea_pretotto , 

This typically happens if you have snapshots on the system. Even though the "current" files are deleted from HDFS, they may still be held by one or more snapshots (which is exactly what makes snapshots useful against accidental data deletion, as you can recover the data from them if needed).

Please check which HDFS directories are snapshottable:

hdfs lsSnapshottableDir

and then check how many snapshots you have under those directories:

hdfs dfs -ls /snapshottable_path/.snapshot

You can probably also verify this by checking the output of "du", which includes the snapshots' sizes:

hdfs dfs -du -h -v -s /snapshottable_path

versus the same command, which excludes the snapshots from the calculation:

hdfs dfs -du -x -h -v -s /snapshottable_path

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#du 
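
For completeness: if snapshots do turn out to reference the deleted files, the blocks are only released once those snapshots are removed. A minimal sketch, using a hypothetical snapshottable path /data/archive and a hypothetical snapshot name s20240519:

hdfs dfs -ls /data/archive/.snapshot

hdfs dfs -deleteSnapshot /data/archive s20240519

hdfs dfsadmin -disallowSnapshot /data/archive

The last command only prevents new snapshots on that path, and it can only succeed after all existing snapshots under it have been deleted.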

 

Best regards

 Miklos

Customer Operations Engineer, Cloudera

Hi Miklos,

thank you for the detailed answer.

I found that the parent of the directory I removed has snapshots enabled, but there are no snapshots.

The command:

hdfs dfs -du -x -h -v -s /snapshottable_path

returns no lines. 

Also the output of "du" is the same.

Should I disable snapshots on the parent directory? Are there other configurations I should apply?

Thank you again.

Expert Contributor

Hi, the "hdfs dfs -du" for that path should return the summary of the disk usage (bytes, kbytes, megabytes, etc..) for that given path. Are you sure there are "no lines returned"? Have you checked the "du" output for a smaller subpath (which has less files underneith), does that return results?

Can you also clarify where you checked the block count before and after the deletion? ("the block count among data nodes did not decrease as expected")

Hi Miklos,

sorry for the typo. I executed the command

hdfs dfs -ls /snapshottable_path/.snapshot

and got no output for that directory, so there are no snapshots there.

The "du" commands ("du -x -h" and "du -h") report the same size.

When I click on the block count alerts on the HDFS service, I can see the number of blocks, which does not decrease. 

The DataNode has 8,743,931 blocks. Critical threshold: 8,000,000 block(s).

Thank you again.

Expert Contributor

Hi Andrea,

Oh, I see, I did not consider that you are seeing this from the DataNodes' perspective. Was this cluster recently upgraded? Is the "Finalize upgrade" step for HDFS still pending?

https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdp/topics/ug_cdh_upgrade_hdfs_fi...

While an HDFS upgrade is not finalized, the DataNodes keep all the previous blocks (including blocks deleted after the upgrade) in case a rollback is needed.
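
If that were the case, the upgrade state can also be checked and finalized from the command line (a sketch; on CDP the finalization step is normally driven from Cloudera Manager):

hdfs dfsadmin -rollingUpgrade query

hdfs dfsadmin -finalizeUpgrade

The first command reports whether a rolling upgrade is still in progress; the second finalizes a pending (non-rolling) upgrade, after which the DataNodes delete the retained "previous" block copies. Only finalize once you are sure a rollback will not be needed.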

Mentor

@andrea_pretotto 

Did you use the -skipTrash option during the deletion?  
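
For context on why this matters: without -skipTrash a deletion only moves the data into the user's trash, so the blocks are freed only after the trash checkpoint expires (controlled by fs.trash.interval) or the trash is emptied manually. A sketch with a hypothetical path:

hdfs dfs -rm -r /data/old_dataset

hdfs dfs -rm -r -skipTrash /data/old_dataset

hdfs dfs -expunge

The first form moves the files to /user/<username>/.Trash and the blocks stay allocated; the second deletes them immediately; -expunge purges the current user's expired trash checkpoints.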

Hi,

thank you for the replies.

@mszurap no upgrade has been made recently, and there are no pending steps.

@Shelton we kept the files in the Trash, but after 24 hours the files were deleted.

On the HDFS side, the capacity has decreased, but the number of blocks is still high (and does not change).

Thank you again

Expert Contributor

The DataNodes should only keep block files which are still managed and known by the NameNode. After a huge deletion event these "pending deletes" may of course take some time to be sent to the DNs (and for the DNs to delete them), but usually that does not take long. Maybe check the "select pending_deletion_blocks" chart, if applicable.
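
The same counter can also be read from the active NameNode's JMX endpoint, for example (assuming the default NameNode HTTP port 9870; older releases use 50070):

curl -s 'http://<active-namenode-host>:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'

In the returned JSON, look at the "PendingDeletionBlocks" value; a large number that drains only slowly means the NameNode is still instructing the DataNodes to remove the freed blocks.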

 

So if the above are not applicable, then check it more deeply with the following steps (a rough sketch follows the list):

- collect a full "hdfs fsck / -files -blocks -locations" output

- pick a DN which you think has more blocks than it should

- verify how many blocks the hdfs fsck report shows for that DN

- verify on the DN side how many block files it is actually storing - do those numbers match?
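
A rough sketch of that comparison (the IP/port and the data directory below are only examples; use your DataNode's address as it appears in the fsck output and your actual dfs.datanode.data.dir value):

hdfs fsck / -files -blocks -locations > /tmp/fsck_full.txt

grep -c '10.17.0.12:9866' /tmp/fsck_full.txt

find /dfs/dn -type f -name 'blk_*' ! -name '*.meta' | wc -l

The grep gives the number of blocks that fsck places on that DataNode, and the find (run on the DataNode host itself) counts the block files actually on its disks; if the on-disk count is much higher, the DN is holding blocks the NameNode no longer knows about.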

Community Manager

@andrea_pretotto, Has the reply helped resolve your issue? If so, can you please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future?  



Regards,

Vidya Sargur,
Community Manager



Hi,

I'm still analyzing the output: the "fsck" command on the path where the delete operations were performed reports just 1 block.

Looking at the attached chart, you can see that on May 19th a lot of data (about 60 TB) was removed from HDFS, and the number of blocks decreased on a single DataNode (bda1node02): roughly 600,000 blocks (assuming 1 block -> 256 MB).

On the other DataNodes, the block count remained the same (or increased slightly).

(attached chart: blocks.PNG)

Expert Contributor

Please remember that a block is not necessarily 256 MB; it can be smaller. Also, not all files have a replication factor of 3; some might have only 1 replica, so it can be totally fine if those were all single-replica files.

600,000 * 256 MB = 153.6 TB would be the maximum, but since blocks can be smaller than 256 MB (60 TB over ~600,000 blocks works out to roughly 100 MB per block on average), the 60 TB freed up is reasonable.

Hi @mszurap ,

I agree with you about these numbers. 

Even if 60-100 TB is a large amount of data, the total number of blocks involved is not that high (close to 600k) compared to the total on each DataNode.

 

Each DataNode reports about 9M blocks, but we found that the problem is related to other directories that contain small files, where the block size is about 2-3 MB. Even though the total size of these directories is not that high, we expect the block count to decrease much more significantly once those are cleaned up.

 

So we are facing the small-files problem, which results in a high number of blocks. The directory we deleted had larger blocks, which is why the decrease in block count was barely noticeable.
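
For reference, a quick way to spot such directories (a sketch with a hypothetical path; the -count output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME):

hdfs dfs -count -v -h /data/small_files_area

A path with a very large FILE_COUNT but a modest CONTENT_SIZE is a small-files hotspot: every tiny file still occupies at least one block (times its replication factor) on the DataNodes, which is what keeps the block count high.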

 

Thank you for the support in the analysis!

 

Expert Contributor

Hi Andrea,

Great to see that the cause has been found, and thanks for marking the post as answered.

All the best, Miklos 
