
Ideally what should be the replication factor in Hadoop Cluster ?



2 REPLIES

Re: Ideally what should be the replication factor in Hadoop Cluster ?


It is generally set to 3.


Re: Ideally what should be the replication factor in Hadoop Cluster ?

Hi @Himani Bansal,

Hadoop Distributed File System (HDFS) stores files as data blocks and distributes these blocks across the entire cluster. The replication factor is a property you can set in the HDFS configuration file to adjust the global replication factor for the entire cluster. As Pankaj mentioned, its default value is 3.
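For reference, the cluster-wide default is controlled by the `dfs.replication` property in `hdfs-site.xml`; a minimal fragment setting it to 3 would look like:

```xml
<!-- hdfs-site.xml: cluster-wide default replication factor -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

You can also override the replication factor per file after the fact with `hdfs dfs -setrep -w 3 <path>`.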

When you set the replication factor to 3, HDFS's placement policy is to put one replica on one node in the local rack, another on a node in a different (remote) rack, and the last on a different node in that same remote rack.
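As a rough illustration of that placement policy (a minimal sketch, not Hadoop's actual implementation; the rack and node names in the topology are hypothetical):

```python
# Sketch of HDFS's default placement policy for replication factor 3:
# replica 1 on the writer's local node, replicas 2 and 3 on two
# distinct nodes in a single remote rack.

def place_replicas(local_rack, local_node, racks):
    """Return three (rack, node) replica locations.
    `racks` maps rack name -> list of node names (assumed topology)."""
    replicas = [(local_rack, local_node)]            # replica 1: local node
    remote_rack = next(r for r in racks if r != local_rack)
    n1, n2 = racks[remote_rack][:2]                  # two distinct remote nodes
    replicas.append((remote_rack, n1))               # replica 2: remote rack
    replicas.append((remote_rack, n2))               # replica 3: same remote rack
    return replicas

topology = {"rack1": ["node1", "node2"], "rack2": ["node3", "node4"]}
print(place_replicas("rack1", "node1", topology))
```

This layout tolerates the loss of an entire rack while keeping cross-rack write traffic to a single remote rack.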

You can read more about it here: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html#Data+Replication

Please log in and accept this answer if it helped you.
