For Namenode HA environment, what is the recommended value for dfs.client.retry.policy.enabled and why?

For a cluster with NameNode HA, I want to know the recommended value for:

dfs.client.retry.policy.enabled

The default is false.

I would also like to understand the reasoning behind the value we choose for this property in case of Namenode HA.

2 REPLIES

@Dinesh Chitlangia

The property dfs.client.retry.policy.enabled is important when HA is enabled, as it enables HDFS client retry in case of NameNode failure. So, after enabling HDFS HA, the property should be set to true in hdfs-site.xml.

If dfs.client.retry.policy.enabled=false in an HA environment, the NameNode connection attempt is made only once and fails without attempting to connect to the failover node. A snippet from the code is below:

+  /**
+   * Return the default retry policy used in RPC.
+   * 
+   * If dfs.client.retry.policy.enabled == false, use TRY_ONCE_THEN_FAIL.
+   * 
+   * Otherwise, first unwrap ServiceException if possible, and then
+   * (1) use multipleLinearRandomRetry for
+   *     - SafeModeException, or
+   *     - IOException other than RemoteException, or
+   *     - ServiceException; and
+   * (2) use TRY_ONCE_THEN_FAIL for
+   *     - non-SafeMode RemoteException, or
+   *     - non-IOException.
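
The selection rules in that comment can be modeled in a few lines of plain Java. This is only a sketch of the decision table as the comment describes it: `RetryChoice`, `choose`, and the exception classes below are hypothetical stand-ins defined locally, not the Hadoop classes, and the ServiceException unwrapping step is left out.

```java
import java.io.IOException;

public class RetryPolicySketch {
    // The two outcomes named in the comment above.
    enum RetryChoice { TRY_ONCE_THEN_FAIL, MULTIPLE_LINEAR_RANDOM_RETRY }

    // Hypothetical stand-ins so the sketch compiles without Hadoop on the classpath.
    static class RemoteException extends IOException {
        RemoteException(String msg) { super(msg); }
    }
    // Assumption: modeled as a RemoteException subtype purely for simplicity.
    static class SafeModeException extends RemoteException {
        SafeModeException(String msg) { super(msg); }
    }
    static class ServiceException extends Exception {
        ServiceException(String msg) { super(msg); }
    }

    /** Picks a retry behavior following the rules listed in the comment. */
    static RetryChoice choose(boolean retryPolicyEnabled, Exception e) {
        if (!retryPolicyEnabled) {
            return RetryChoice.TRY_ONCE_THEN_FAIL;            // property is false
        }
        if (e instanceof SafeModeException) {
            return RetryChoice.MULTIPLE_LINEAR_RANDOM_RETRY;  // (1) SafeModeException
        }
        if (e instanceof RemoteException) {
            return RetryChoice.TRY_ONCE_THEN_FAIL;            // (2) non-SafeMode RemoteException
        }
        if (e instanceof IOException || e instanceof ServiceException) {
            return RetryChoice.MULTIPLE_LINEAR_RANDOM_RETRY;  // (1) other IOException, ServiceException
        }
        return RetryChoice.TRY_ONCE_THEN_FAIL;                // (2) non-IOException
    }

    public static void main(String[] args) {
        Exception refused = new IOException("connection refused");
        System.out.println(choose(false, refused));
        System.out.println(choose(true, refused));
        System.out.println(choose(true, new RemoteException("access denied")));
    }
}
```

With the flag off, every failure maps to a single attempt; with it on, only the cases listed under (1) are retried.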

Guru

The property dfs.client.retry.policy.enabled should be set to false in a cluster with HA enabled.

The reason is that in a NameNode High Availability (NN HA) system with this property set to true, when one of the NameNodes goes down (NN process stopped), attempts to use HDFS can result in repeated errors and apparent hangs. Running or new jobs that depend on HDFS access will also fail because the client keeps talking to the failed NN.
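
To get a feel for how long such an apparent hang can last, the sketch below sums the nominal sleep time of a retry spec written as `sleepMillis,retries` pairs. It assumes the spec is `10000,6,60000,10`, which I believe is the stock value of dfs.client.retry.policy.spec; please check the hdfs-default.xml for your version. The class and method names are made up for illustration, and the real sleeps are randomized around these values, so treat the result as a rough figure only.

```java
public class RetrySpecSketch {
    /**
     * Sums sleepMillis * retries over each "sleepMillis,retries" pair.
     * Hypothetical helper, not part of Hadoop; ignores sleep randomization.
     */
    static long nominalWaitMillis(String spec) {
        String[] parts = spec.split(",");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("spec must be pairs of sleepMillis,retries: " + spec);
        }
        long total = 0;
        for (int i = 0; i < parts.length; i += 2) {
            long sleepMillis = Long.parseLong(parts[i].trim());
            long retries = Long.parseLong(parts[i + 1].trim());
            total += sleepMillis * retries;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed default spec: 6 retries about 10 s apart, then 10 retries about 60 s apart.
        String spec = "10000,6,60000,10";
        System.out.println(nominalWaitMillis(spec) / 1000 + " seconds");  // prints "660 seconds"
    }
}
```

Under that assumption, a client that keeps retrying against the stopped NN could wait on the order of 11 minutes before giving up, which matches the "apparent hang" symptom.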
