Oozie with HDFS High Availability
- Labels: Apache Oozie, HDFS
Created on ‎05-10-2018 03:06 AM - edited ‎09-16-2022 06:12 AM
Hello everyone,
I have a cluster with HDFS High Availability (HA) enabled. The cluster has two NameNodes, one active and one in standby state, plus three JournalNodes, a Balancer, and Failover Controllers.
My question: how should I configure the nameNode and jobTracker parameters in the job.properties file of my Oozie workflows so that they always point to the active NameNode and JobTracker (in case of a failure or a manual switch of the NameNode)?
Thanks for any information.
Created ‎05-14-2018 10:28 AM
Thanks @Harsh J. Indeed, I finally solved it by using hdfs://hanameservice for the name node and yarnrm for the job tracker.
Created ‎05-14-2018 09:12 AM
Note that after you enable HDFS HA (or YARN HA, etc.), you should always use the logical URI everywhere and never hardcode a NameNode hostname in any manual configuration.
Oozie as a service carries HDFS client configs that are maintained for it by CM (Cloudera Manager). These become HA-aware once you complete the HDFS HA wizard. All that remains is to submit new jobs to Oozie with the nameNode and jobTracker URIs pointing to the logical name (such as hdfs://nameservice1) instead of the previous single host:port value.
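For example, a job.properties along these lines (a sketch only: nameservice1 is the nameservice ID the HA wizard creates by default, and the application path is just a placeholder; the actual IDs on your cluster come from dfs.nameservices in hdfs-site.xml and, for YARN HA, yarn.resourcemanager.cluster-id in yarn-site.xml):

```properties
# Logical HDFS URI from the HA wizard -- never a specific NameNode host:port
nameNode=hdfs://nameservice1

# With YARN ResourceManager HA enabled, the logical RM cluster id
# (yarn.resourcemanager.cluster-id) can be used here; without RM HA,
# this would be the ResourceManager's host:8032 address
jobTracker=yarnrm

# The workflow path resolves against the logical nameservice,
# so it keeps working across NameNode failovers
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
```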
