HDFS cluster with replication factor 1 storing all blocks of a file on one DataNode. Is this common behavior?

Contributor

Hi team,

I have a 3-node HDFS cluster with a replication factor of 1. I copied a 10 GB file into HDFS with the hdfs dfs -put command, and it was split into 86 blocks of 128 MB each. But all 86 of these blocks were stored on a single DataNode. Is this common behaviour?

I expected all 86 blocks to be distributed across all 3 nodes.

Is there any configuration setting that controls this distribution?
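
In case it helps, here is roughly how the block locations can be inspected programmatically. This is a minimal sketch against the standard HDFS Java API; /user/test/bigfile is a placeholder path, and the Configuration is assumed to pick up the cluster's core-site.xml and hdfs-site.xml from the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocations {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/test/bigfile"); // placeholder path

        FileStatus status = fs.getFileStatus(file);
        // One BlockLocation per block; with replication factor 1,
        // each block reports exactly one host.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (int i = 0; i < blocks.length; i++) {
            System.out.println("block " + i + " -> "
                    + String.join(",", blocks[i].getHosts()));
        }
        fs.close();
    }
}

The same information is available from the shell with hdfs fsck /user/test/bigfile -files -blocks -locations.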

1 REPLY

Super Collaborator

Hi,

Yes, this is the default behaviour when you put the file from a machine that is itself a DataNode: the writer's local node receives the first (and, with replication factor 1, the only) replica of every block. You can get the blocks distributed by issuing the hadoop fs -put command from a client machine that is not running a DataNode.

According to the docs (the Javadoc on the default block placement policy):

The replica placement strategy is that if the writer is on a datanode, the 1st replica is placed on the local machine, otherwise a random datanode. The 2nd replica is placed on a datanode that is on a different rack. The 3rd replica is placed on a datanode which is on the same rack as the first replica.
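
So with replication factor 1 and the writer on a DataNode, every block's single replica is the "1st replica" and lands on the local machine. If you have to write from a DataNode host but want to opt out of that local preference, newer Hadoop releases also expose a NO_LOCAL_WRITE create flag (added by HDFS-3702; check that your version has it). A minimal sketch assuming that flag is available, again with a placeholder path:

import java.nio.charset.StandardCharsets;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class NoLocalWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/test/bigfile"); // placeholder path

        // Ask HDFS not to place the replica on the local DataNode.
        // NO_LOCAL_WRITE is advisory and only present in releases
        // that include HDFS-3702.
        FSDataOutputStream out = fs.create(
                file,
                FsPermission.getFileDefault(),
                EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE,
                           CreateFlag.NO_LOCAL_WRITE),
                conf.getInt("io.file.buffer.size", 4096), // buffer size
                (short) 1,                                // replication factor
                128L * 1024 * 1024,                       // 128 MB block size
                null);                                    // no progress callback
        out.write("example payload".getBytes(StandardCharsets.UTF_8));
        out.close();
        fs.close();
    }
}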