How do you specify a highly available HDFS namespace in an Apache Falcon cluster definition?

Specifically, how should the following interfaces be written

<interface type="readonly" endpoint="hftp://<host>:50070"/>
<interface type="write" endpoint="hdfs://<host>:8020"/>

if we are pointing to a cluster with HDFS HA enabled?

1 ACCEPTED SOLUTION

Cloudera Employee

There are a couple of considerations to take into account when using NameNode HA with Falcon and Oozie. In all cases, you need to use the nameservice ID when referring to the NameNode in the cluster XML. This value appears in hdfs-site.xml as part of the dfs.ha.namenodes.[nameservice ID] property name. For multi-cluster installs, you need to set up the NameNode HA nameservice details for every cluster in every cluster's configuration. For example, with two clusters, hdfs-site.xml on both cluster one and cluster two will contain two nameservice IDs; likewise, with three clusters, all three clusters would carry three nameservice IDs. A two-cluster implementation would look similar to the following:

<property>
  <name>dfs.ha.namenodes.hacluster1</name>
  <value>c1nn1,c1nn2</value>
</property>
<property>
  <name>dfs.ha.namenodes.hacluster2</name>
  <value>c2nn1,c2nn2</value>
</property>
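
For reference, a rough sketch of the companion client-side settings each cluster would also carry for the remote nameservice; the hostnames below are placeholders, not values from the original post:

<!-- every nameservice this cluster needs to resolve -->
<property>
  <name>dfs.nameservices</name>
  <value>hacluster1,hacluster2</value>
</property>
<!-- RPC addresses of the remote cluster's two NameNodes -->
<property>
  <name>dfs.namenode.rpc-address.hacluster2.c2nn1</name>
  <value>c2nn1-host.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hacluster2.c2nn2</name>
  <value>c2nn2-host.example.com:8020</value>
</property>
<!-- lets HDFS clients fail over between the two NameNodes -->
<property>
  <name>dfs.client.failover.proxy.provider.hacluster2</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>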

Now, when you set up Falcon, provide both cluster definitions on both clusters.
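
To illustrate, here is a minimal sketch of a Falcon cluster entity that references the hacluster1 nameservice rather than a single NameNode host; the colo, versions, and the non-HDFS endpoints and paths are illustrative placeholders, not values from this thread:

<cluster colo="colo1" description="Primary HA cluster" name="hacluster1-entity" xmlns="uri:falcon:cluster:0.1">
  <interfaces>
    <!-- readonly and write endpoints use the HA nameservice ID instead of a NameNode hostname -->
    <interface type="readonly" endpoint="hftp://hacluster1:50070" version="2.2.0"/>
    <interface type="write" endpoint="hdfs://hacluster1:8020" version="2.2.0"/>
    <!-- remaining endpoints are placeholders for this sketch -->
    <interface type="execute" endpoint="rm-host.example.com:8050" version="2.2.0"/>
    <interface type="workflow" endpoint="http://oozie-host.example.com:11000/oozie/" version="4.0.0"/>
    <interface type="messaging" endpoint="tcp://falcon-host.example.com:61616?daemon=true" version="5.1.6"/>
  </interfaces>
  <locations>
    <location name="staging" path="/apps/falcon/hacluster1/staging"/>
    <location name="working" path="/apps/falcon/hacluster1/working"/>
  </locations>
</cluster>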


5 REPLIES

Rising Star

If the nameservice is "myHA", the interfaces should be "hdfs://myHA".
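
So, assuming a nameservice ID of myHA, the two interfaces from the question would look roughly like this:

<interface type="readonly" endpoint="hftp://myHA:50070"/>
<interface type="write" endpoint="hdfs://myHA"/>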

Just to clarify, the cluster in question is different from the one where Falcon is running, i.e. it is a D/R cluster we want to copy data to.

Contributor

You can point to it directly via its host address, or you can do as @bvellanki (balu) mentioned and use its HA nameservice ID. For example, if the nameservice for your backup cluster is called DRHA, the address would be hdfs://DRHA:8020. See below:

<interface type="readonly" endpoint="hftp://DRHA.company.com:50070" version="2.2.0"/>         
<interface type="write" endpoint="hdfs://DRHA.company.com:8020" version="2.2.0"/> 

<!-- You can also do this, depending on preference -->

<interface type="readonly" endpoint="hftp://DRHA:50070" version="2.2.0"/>         
<interface type="write" endpoint="hdfs://DRHA:8020" version="2.2.0"/> 


Mentor

@dkjerrumgaard has this been resolved? Can you post your solution or accept best answer?
