HDFS replication factor and minimum number of data nodes in the HDFS cluster

We have a Hadoop cluster with only 2 data node machines.

In the HDFS configuration we set Block replication to 3:

Block replication=3

Is it OK to set Block replication=3 when we have only two data nodes in the cluster?

My understanding is that with Block replication=3 and 2 data nodes in the HDFS cluster, one machine would hold 2 replicas and the other machine would hold 1 replica. Am I correct?

Michael-Bronson
1 ACCEPTED SOLUTION

Rising Star

@mike_bronson7 A cluster should have at least 3 data nodes to hold 3 healthy replicas of each block, since the default replication factor is 3. HDFS does not place more than one replica of the same block on the same data node, so one machine will not hold 2 replicas. With only 2 data nodes, each block gets 2 healthy replicas, one per data node, and is reported as under-replicated.
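To make the arithmetic concrete, here is a minimal Python sketch of the rule described above. It is a simplified model, not HDFS's actual placement code: the function name `place_replicas` and the node names `dn1` and `dn2` are hypothetical, and rack awareness is ignored. It only assumes that each data node holds at most one replica of a given block.

```python
def place_replicas(replication_factor, live_datanodes):
    """Simplified model: at most one replica of a block per data node.

    Returns the nodes that receive a replica and the number of replicas
    that cannot be placed (the under-replication count for that block).
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # One replica per distinct node, so placement is capped by the node count.
    targets = list(live_datanodes)[:replication_factor]
    missing = replication_factor - len(targets)
    return targets, missing


# The cluster from the question: replication factor 3, two data nodes.
targets, missing = place_replicas(3, ["dn1", "dn2"])
print(targets, missing)  # prints ['dn1', 'dn2'] 1
```

In this model the block ends up with one replica on each of the two nodes and one replica missing, which matches the under-replicated state described in the answer. On a real cluster, `hdfs fsck /` reports the under-replicated blocks.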
