Does Cloudera CDH 6.3.3 still have the small files issue?

Explorer

Small and empty files keep recurring on our current version of the CDH cluster. Does this issue still exist in version 6.3.3?

1 ACCEPTED SOLUTION

Re: Does Cloudera CDH 6.3.3 still have the small files issue?

Expert Contributor

Hi @Mondi,

Yes, small files can still cause problems in CDH 6.3.3. This is not tied to the Cloudera version; it follows from how the NameNode and HDFS interact when many small files are stored in HDFS. The NameNode must store and manage the metadata for every file in memory, so a large number of small files creates a large amount of metadata relative to the data actually stored.
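
To give a rough feel for the scale, here is a back-of-the-envelope sketch in Python. It assumes the commonly quoted rule of thumb that each file and each block costs the NameNode on the order of 150 bytes of heap, and a 128 MiB block size. Both figures are assumptions for illustration, not measurements from your cluster, and the function names are made up for this example.

```python
# Rule of thumb (an assumption, not a measured value): each file and each
# block is one NameNode object costing roughly 150 bytes of heap.
BYTES_PER_OBJECT = 150
DEFAULT_BLOCK_SIZE = 128 * 1024**2  # assumed dfs.blocksize of 128 MiB


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def namenode_objects(total_bytes, file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Count file + block objects needed to store total_bytes as equal-sized files."""
    files = ceil_div(total_bytes, file_size)
    blocks_per_file = max(1, ceil_div(file_size, block_size))
    return files * (1 + blocks_per_file)


def estimated_heap_bytes(total_bytes, file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Very rough NameNode heap estimate for the given layout."""
    return namenode_objects(total_bytes, file_size, block_size) * BYTES_PER_OBJECT


if __name__ == "__main__":
    one_tib = 1024**4
    for label, size in [("1 MiB files", 1024**2), ("128 MiB files", 128 * 1024**2)]:
        heap_mib = estimated_heap_bytes(one_tib, size) / 1024**2
        print(f"{label}: ~{heap_mib:.1f} MiB of NameNode heap per TiB stored")
```

Under these assumptions the same 1 TiB of data needs about 128 times more NameNode objects when written as 1 MiB files than as 128 MiB files, which is why the file count matters more than the CDH version.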

 

To understand more about the impact of small files in HDFS and how to manage this, please refer to this article:

 

https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-c...
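
As a first step before any cleanup, it can help to see which directories hold the small and empty files. The following is a minimal sketch that counts them per directory from the output of `hdfs dfs -ls -R`. The 1 MiB cutoff, the function names, and the `/data` path in the comment are hypothetical choices for illustration, and a recursive listing of a very large tree puts load on the NameNode, so treat it as a starting point only.

```python
import subprocess

SMALL_FILE_BYTES = 1024**2  # hypothetical cutoff: anything under 1 MiB counts as small


def parse_ls_line(line):
    """Return (size_bytes, path) for a file line, or None for directories and other lines."""
    # Expected columns: permissions, replication, owner, group, size, date, time, path
    parts = line.strip().split(None, 7)
    if len(parts) < 8 or not parts[0].startswith("-"):
        return None
    try:
        return int(parts[4]), parts[7]
    except ValueError:
        return None


def small_files_by_dir(lines, threshold=SMALL_FILE_BYTES):
    """Count files smaller than threshold (empty files included) per parent directory."""
    counts = {}
    for line in lines:
        parsed = parse_ls_line(line)
        if parsed is None:
            continue
        size, path = parsed
        if size < threshold:
            parent = path.rsplit("/", 1)[0] or "/"
            counts[parent] = counts.get(parent, 0) + 1
    return counts


def list_hdfs(path):
    """Run a recursive HDFS listing and return its output lines."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


# Example (requires the hdfs client on PATH; "/data" is a placeholder path):
# for directory, count in sorted(small_files_by_dir(list_hdfs("/data")).items()):
#     print(count, directory)
```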

 

Regards,

Steve

Re: Does Cloudera CDH 6.3.3 still have the small files issue?

Explorer

Thanks @StevenOD, I'll check on this.