Oozie with HDFS High Availability

Expert Contributor

Hello everyone,

I have a cluster with HDFS High Availability (HA) enabled. The cluster has two NameNodes, one active and one in standby state, plus three JournalNodes, a balancer, and failover controllers.

My question: how should I set the nameNode and jobTracker parameters in the job.properties file of my Oozie workflows so that they always point to the active NameNode and JobTracker, even after a failure or a manual switch of the NameNode?

Thanks for any information

1 ACCEPTED SOLUTION

Expert Contributor

Thanks @Harsh J. I finally solved it by using hdfs://hanameservice for nameNode and yarnrm for jobTracker.

 

 


2 REPLIES

Mentor
The requirement for Oozie is the same as the general requirement: after you
enable HDFS HA (or YARN HA, etc.), always use the logical URI everywhere and
never hardcode a NameNode hostname in any manual configuration.

Oozie as a service carries HDFS client configs that CM (Cloudera Manager)
maintains for it. These become HA-aware once you complete the HDFS HA wizard.
All that remains is to submit new jobs to Oozie with the nameNode and
jobTracker URIs pointing to the logical name (such as hdfs://nameservice1)
instead of the previous single-host/port value.
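
For illustration, a job.properties along these lines might look like the
sketch below. It assumes the nameservice is called nameservice1, as in the
example above; yarnrm is the jobTracker value the original poster reported
using in this thread, and the application path is purely hypothetical.
Substitute the logical names from your own cluster's client configuration.

```properties
# job.properties (illustrative sketch; all values are examples)

# Logical HDFS nameservice instead of a single NameNode host:port.
# The name must match a nameservice ID listed under dfs.nameservices
# in the HDFS client configuration.
nameNode=hdfs://nameservice1

# Logical ResourceManager name instead of a single host:port.
# "yarnrm" is the value reported as working in this thread; yours may differ.
jobTracker=yarnrm

# Hypothetical workflow location, resolved through the logical nameservice.
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
```

Because neither value names a specific host, the same file should keep
working after a NameNode failover or a manual switch.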
