Expert Contributor
Posts: 69
Registered: ‎11-24-2017
Accepted Solution

Oozie with HDFS High Availability

Hello everyone,

I have a cluster with HDFS High Availability (HA) enabled. The cluster has two NameNodes, one active and one in standby state, plus three JournalNodes, a balancer, and failover controllers.


My question: how should I set the nameNode and jobTracker parameters in the properties file of my Oozie workflows so that they always point to the active NameNode and JobTracker, even after a failure or a manual switch of the NameNode?



Thanks for any information

Posts: 1,695
Kudos: 341
Solutions: 264
Registered: ‎07-31-2013

Re: Oozie with HDFS High Availability

The requirement for Oozie is no different from the general requirement
after you enable HDFS HA (or YARN HA, etc.): always use the logical
URI everywhere and never hardcode a NameNode hostname in any
manual configuration.

Oozie as a service carries HDFS client configs that CM maintains for it.
These become HA-aware once you complete the HDFS HA wizard. All that
remains is to submit new jobs to Oozie with the nameNode and jobTracker
values pointing to the logical names (such as hdfs://nameservice1 for the
nameNode) instead of the previous single-host/port values.

Expert Contributor
Posts: 69
Registered: ‎11-24-2017

Re: Oozie with HDFS High Availability

Thanks @Harsh J, I finally solved it by using hdfs://hanameservice for the nameNode and yarnrm for the jobTracker.
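
For anyone landing here later, the settings reported above might look roughly like this in a job.properties file. This is only a sketch: hdfs://hanameservice and yarnrm are the values from this thread, while the file name and the application path are assumptions for illustration. Use the nameservice and ResourceManager names defined in your own cluster configuration.

```properties
# Logical HDFS nameservice (value from this thread), not a NameNode host:port
nameNode=hdfs://hanameservice

# Logical ResourceManager name (value from this thread), not a host:port
jobTracker=yarnrm

# Hypothetical workflow location; it resolves through the logical nameservice
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
```

Because the application path is built from ${nameNode}, it keeps resolving after a NameNode failover with no change to the file.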