Ideally, what should the replication factor be in a Hadoop cluster?

What should the replication factor be in a Hadoop cluster?

Re: Ideally, what should the replication factor be in a Hadoop cluster?

It is generally set to 3.

Re: Ideally, what should the replication factor be in a Hadoop cluster?

Hi @Himani Bansal,

Hadoop Distributed File System (HDFS) stores files as data blocks and distributes these blocks across the cluster. The replication factor is a property in the HDFS configuration file (dfs.replication in hdfs-site.xml) that sets the default replication factor for the entire cluster. As Pankaj mentioned, its default value is 3.
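To make the configuration part concrete, here is a minimal sketch of what the setting looks like and how it could be read back. It assumes the property lives in hdfs-site.xml under the name dfs.replication; the helper read_replication and the embedded XML string are made up for illustration and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Example hdfs-site.xml content (illustrative); 3 matches the HDFS default.
HDFS_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
"""

DEFAULT_REPLICATION = 3  # value HDFS uses when the property is not set


def read_replication(xml_text: str) -> int:
    """Return dfs.replication from hdfs-site.xml text, or the default.

    Hypothetical helper for illustration only; not a Hadoop API.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "dfs.replication":
            return int(prop.findtext("value", "").strip())
    return DEFAULT_REPLICATION


print(read_replication(HDFS_SITE_XML))
```

Keep in mind that this value is the default applied to newly written files. Files that already exist keep the replication factor they were written with unless you change it, for example with hdfs dfs -setrep.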

When you set the replication factor to 3, HDFS’s placement policy is to put one replica on a node in the local rack, another on a node in a different (remote) rack, and the last on a different node in the same remote rack.
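The placement policy described above can be sketched roughly as follows. This is a simplified toy model, not the actual HDFS block placement code: the function choose_replica_nodes, the topology dictionary, and the node and rack names are all hypothetical, and it assumes the first replica goes on the node the client writes from.

```python
import random


def choose_replica_nodes(topology, writer_node, rng=random):
    """Pick three nodes for a block with replication factor 3.

    topology: dict mapping rack name -> list of node names (hypothetical layout).
    writer_node: node the client writes from; its rack is the "local rack".
    """
    # Find the rack that contains the writer; that is the local rack.
    local_rack = next(r for r, nodes in topology.items() if writer_node in nodes)
    # A remote rack must have at least two nodes to hold replicas 2 and 3.
    remote_racks = [r for r, nodes in topology.items()
                    if r != local_rack and len(nodes) >= 2]
    if not remote_racks:
        raise ValueError("need a remote rack with at least two nodes")
    remote_rack = rng.choice(remote_racks)
    # Replicas 2 and 3: two different nodes in the same remote rack.
    second, third = rng.sample(topology[remote_rack], 2)
    return [writer_node, second, third]


topology = {
    "rack1": ["node1", "node2"],
    "rack2": ["node3", "node4"],
    "rack3": ["node5", "node6"],
}
print(choose_replica_nodes(topology, "node1"))
```

With this layout the block ends up on two racks only, so losing a whole rack still leaves at least one replica, while two of the three copies stay within one rack to limit cross-rack write traffic.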

You can read more about it here: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html#Data+Replication

Please log in and accept this answer if it helped you.