Created 05-30-2022 08:36 AM
Hi,
after deleting terabytes of data from HDFS (about a quarter of the total capacity), the block count on the DataNodes did not decrease as expected. It is still above the critical threshold.
How can this be resolved?
Thank you
Created 06-08-2022 01:04 AM
Please remember that one block is not necessarily 256 MB; it can be smaller. Also, not all files have a replication factor of 3; some might have only one replica, so the numbers can add up if many of the deleted files were single-replica.
600,000 * 256 MB = 153.6 TB is the theoretical maximum, but since blocks can be smaller than 256 MB, the 60 TB freed up is plausible.
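If you want to sanity-check the total block count and the average block size yourself, the fsck summary reports both (note: scanning the root can take a while on a large namespace):
hdfs fsck / | tail -n 20
The summary contains a line of the form "Total blocks (validated): <count> (avg. block size <bytes> B)".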
Created 05-30-2022 10:37 AM
Hello @andrea_pretotto ,
This typically happens if you have snapshots on the system. Even though the "current" files are deleted from HDFS, they may still be held by one or more snapshots (which is exactly what makes snapshots useful against accidental data deletion: you can recover the data from a snapshot if needed).
Please check which HDFS directories are snapshottable:
hdfs lsSnapshottableDir
and then check how many snapshots you have under those directories:
hdfs dfs -ls /snapshottable_path/.snapshot
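If you find snapshots that are no longer needed, deleting them releases the blocks they hold. A sketch (the snapshot name s20220501 is hypothetical; use the names from the listing above):
hdfs dfs -deleteSnapshot /snapshottable_path s20220501
Once no snapshots remain, you can also disallow further snapshots on the directory:
hdfs dfsadmin -disallowSnapshot /snapshottable_path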
You can probably also verify this by checking the output of "du", which includes the snapshots' sizes:
hdfs dfs -du -h -v -s /snapshottable_path
versus the same command with snapshots excluded from the calculation:
hdfs dfs -du -x -h -v -s /snapshottable_path
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#du
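To illustrate what a snapshot-held difference can look like (the sizes below are made-up numbers; the header row comes from the -v flag):
$ hdfs dfs -du -h -v -s /snapshottable_path
SIZE    DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS   FULL_PATH_NAME
12.5 T  37.5 T                                  /snapshottable_path
$ hdfs dfs -du -x -h -v -s /snapshottable_path
SIZE    DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS   FULL_PATH_NAME
8.2 T   24.6 T                                  /snapshottable_path
If the -x numbers are noticeably smaller, the difference is data referenced only by snapshots.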
Best regards
Miklos
Customer Operations Engineer, Cloudera
Created 05-31-2022 12:38 AM
Hi Miklos,
thank you for the detailed answer.
I found that the parent of the directory I removed has snapshots enabled, but there are no snapshots.
The command:
hdfs dfs -du -x -h -v -s /snapshottable_path
returns no lines.
Also, the output of "du" with and without -x is the same.
Should I disable snapshots on the parent directory? Are there other configurations I should apply?
Thank you again.
Created 05-31-2022 01:08 AM
Hi, the "hdfs dfs -du" for that path should return a summary of the disk usage (bytes, kilobytes, megabytes, etc.) for the given path. Are you sure there are "no lines returned"? Have you checked the "du" output for a smaller subpath (one with fewer files underneath), and does that return results?
Can you also clarify where you checked the block count before and after the deletion? ("the block count among data nodes did not decrease as expected")
Created 05-31-2022 02:09 AM
Hi Miklos,
sorry for the typo. I executed the command
hdfs dfs -ls /snapshottable_path/.snapshot
and got no output for that directory.
The "du" commands ("du -x -h" and "du -h") report the same size.
When I click on the block count alert on the HDFS service, I can see the number of blocks, which is not decreasing.
The DataNode has 8,743,931 blocks. Critical threshold: 8,000,000 block(s).
Thank you again.
Created 05-31-2022 07:33 AM
Hi Andrea,
Oh, I see; I did not consider that you are looking at this from the DataNodes' perspective. Was this cluster recently upgraded? Is the "Finalize Upgrade" step for HDFS still pending?
While an HDFS upgrade is not finalized, DataNodes keep all the previous blocks (including blocks deleted after the upgrade) in case a rollback is needed.
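If that is the case, you can check and finalize it from the command line (a sketch; -rollingUpgrade query only applies if a rolling upgrade was used, and finalizing removes the ability to roll back):
# check whether a rolling upgrade is still pending finalization
hdfs dfsadmin -rollingUpgrade query
# finalize the upgrade once you are sure no rollback is needed
hdfs dfsadmin -finalizeUpgrade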
Created 05-31-2022 01:00 PM
Did you use the -skipTrash option during the deletion?
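If not, the data was only moved to the user's trash directory, and its blocks are freed when the trash checkpoint expires (fs.trash.interval). For example (the trash path depends on the user who ran the delete):
# deleted files stay here until the trash interval expires
hdfs dfs -ls /user/<user>/.Trash
# remove checkpoints older than the retention window
hdfs dfs -expunge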
Created 06-01-2022 03:21 AM
DNs should only keep blocks that are still managed and known by the NN. After a huge deletion event these "pending deletes" can of course take some time to be sent to the DNs (and for the DNs to actually delete them), but usually that does not take long. If applicable, check the "select pending_deletion_blocks" chart in Cloudera Manager.
If the above does not apply, then check it more deeply (see the sketch after this list):
- collect a full "hdfs fsck -files -blocks -locations" output
- pick a DN which you think has more blocks than it should
- verify how many blocks the fsck report attributes to that DN
- verify on the DN side how many block files it is storing - do those numbers match?
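A minimal sketch of those steps (the DN address 10.0.0.21:9866 and the data directory /data/dfs/dn are assumptions; substitute your own values, and note the DN port is 50010 on Hadoop 2):
# 1. collect the full block report, including replica locations
hdfs fsck / -files -blocks -locations > /tmp/fsck.out
# 2-3. rough count of block lines that list this DataNode as a replica location
grep -c '10.0.0.21:9866' /tmp/fsck.out
# 4. on the DataNode itself, count the block files actually on disk
find /data/dfs/dn -name 'blk_*' ! -name '*.meta' | wc -l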
Created 06-05-2022 11:11 PM
@andrea_pretotto, Has the reply helped resolve your issue? If so, can you please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future?
Regards,
Vidya Sargur,