HDFS small files

New Contributor

In a CDH 6.3 environment, the HDFS service is raising block count warning alerts for certain DataNodes.
An HDFS rebalance has been triggered a few times, but it has not helped much.
The alerts could be due to small files stored in some HDFS locations.

What is an effective way to identify small files and their locations for housekeeping cleanup?

1 REPLY

New Contributor

Hello, here is an interesting article that covers the different challenges small files cause and how to compact them.

https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-c...
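
For the identification part of the question, one rough starting point (not taken from the article) is to capture the output of `hdfs dfs -ls -R <path>` and count the small files per parent directory. The sketch below assumes the standard eight-column listing format; the function name `small_files_by_dir` and the 16 MB threshold are illustrative choices, not Cloudera defaults.

```python
import posixpath
from collections import Counter

# Assumed cutoff for "small"; tune it relative to your HDFS block size.
SMALL_FILE_BYTES = 16 * 1024 * 1024


def small_files_by_dir(ls_lines, threshold=SMALL_FILE_BYTES):
    """Count files smaller than `threshold` bytes per parent directory.

    `ls_lines` is an iterable of lines as printed by `hdfs dfs -ls -R`:
    permissions, replication, owner, group, size, date, time, path.
    """
    counts = Counter()
    for raw in ls_lines:
        # Split into at most 8 fields so paths containing spaces stay intact.
        parts = raw.rstrip("\n").split(None, 7)
        if len(parts) < 8 or parts[0].startswith("d"):
            continue  # skip directories and non-listing lines
        try:
            size = int(parts[4])
        except ValueError:
            continue  # skip anything that does not look like a file entry
        if size < threshold:
            counts[posixpath.dirname(parts[7])] += 1
    return counts
```

To use it, save the listing to a file (or pipe it in), pass the lines to `small_files_by_dir`, and look at `most_common(20)` on the result to see which directories hold the most small files. A recursive listing can be slow on a very large namespace, so it may be worth starting from the directories you already suspect.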