
Does Cloudera CDH4 enforce a minimum HDFS block size?

I am running some tests using compressed files and small block sizes.

I tried to set dfs.block.size through Cloudera Manager to 8 MB and received the error: "8388608 less than 16777216." Is 16 MB a hard minimum for the dfs.block.size parameter, or is there another setting that conflicts with dfs.block.size?

Re: Does Cloudera CDH4 enforce a minimum HDFS block size?

Yes, we enforce a minimum when the block size is set cluster-wide, because a very small block size defeats the purpose of HDFS: large files end up split into too many blocks, each of which the NameNode must track.

However, if your use case forces you into this for whatever reason, it is better to apply the custom block size only to your specific set of files.

The block size can be controlled by each client. For example, you can create files on HDFS with an 8 MB block size by running the following with 'hadoop fs':

~> hadoop fs -Ddfs.blocksize=8M -put localfile hdfsfile
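
To confirm the block size a file was actually created with, 'hadoop fs -stat' can print it in bytes ('%o' is the block-size format specifier); for the 8 MB example above it would report 8388608:

~> hadoop fs -stat %o hdfsfile
8388608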

Likewise, the API also allows you to set block sizes arbitrarily (they must be a multiple of 512 bytes, however): http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/org/apache/hadoop/fs/FileSystem.html#create(org.ap... boolean, int, short, long)
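
For instance, a minimal Java sketch using that create(Path, boolean, int, short, long) overload might look like the following; the output path and payload here are just placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallBlockWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        long blockSize = 8L * 1024 * 1024; // 8 MB; must be a multiple of 512 bytes
        short replication = 3;
        int bufferSize = 4096;

        // Hypothetical output path; overwrite=true replaces any existing file
        Path out = new Path("/tmp/smallblock-example.dat");
        FSDataOutputStream stream =
                fs.create(out, true, bufferSize, replication, blockSize);
        try {
            stream.writeUTF("written with a client-chosen 8 MB block size");
        } finally {
            stream.close();
        }
    }
}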

The Cloudera Manager field sets the cluster-wide default, which we recommend keeping at a sane value that applies generically across use cases. This is why the configuration validation exists: to ensure you do not accidentally impact other users and applications.