How to setup High Availability for Oozie?

To run the Oozie server in HA mode, the Hortonworks documentation says that a load balancer, virtual IP, or round-robin DNS is needed. Since none of these is part of the Hadoop ecosystem, which tool is suggested here: HAProxy, nginx, or a commercial one?

Accepted Solutions

Re: How to setup High Availability for Oozie?

Ambari does not manage HA for Oozie yet. Here is a list of manual steps that I recently dug out for someone (AMBARI-6683 is the related JIRA, but BUG-13082 has the details you are looking for).

Pasting here:

1) Add the oozie-server component using the +Add button on the host page.

2) Using Apache httpd (with mod_proxy and mod_proxy_balancer), configure load balancing with a URL liveness check. This means the balancer checks that an Oozie URL is available before returning it. We need this because one of the Oozie servers can be unavailable, and the load balancer should not return the URL for it.
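To make the liveness check in step 2 concrete, here is a minimal Python sketch of the idea, not the httpd configuration itself. It assumes each server is probed through Oozie's admin status REST endpoint (`/v1/admin/status`, which reports `systemMode`). The function names and any hostnames you pass in are hypothetical, and the `fetch` parameter only exists so the logic can be tried without a running cluster.

```python
import json
from typing import Callable, Iterable, List
from urllib.request import urlopen


def fetch_status(base_url: str, timeout: float = 3.0) -> str:
    """Return the raw body of the admin status endpoint for one Oozie server."""
    with urlopen(f"{base_url}/v1/admin/status", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_live(base_url: str, fetch: Callable[[str], str] = fetch_status) -> bool:
    """True only if the server answers and reports systemMode NORMAL."""
    try:
        return json.loads(fetch(base_url)).get("systemMode") == "NORMAL"
    except (OSError, ValueError, AttributeError):
        # Connection refused or timed out, body is not JSON, or JSON is not an object.
        return False


def live_servers(base_urls: Iterable[str],
                 fetch: Callable[[str], str] = fetch_status) -> List[str]:
    """Keep only the servers a balancer should be allowed to hand out."""
    return [url for url in base_urls if is_live(url, fetch)]
```

Calling `live_servers` with the base URLs of both Oozie servers (each ending in `:11000/oozie`) returns only the ones currently answering in NORMAL mode, which is the same decision the balancer has to make before returning a URL.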

3) In oozie-site.xml config:

– add oozie.zookeeper.connection.string = <list of zookeeper hosts with ports> (example: c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181,c6403.ambari.apache.org:2181)

– add these classes "org.apache.oozie.service.ZKLocksService,org.apache.oozie.service.ZKXLogStreamingService,org.apache.oozie.service.ZKJobsConcurrencyService" to property oozie.services.ext.

– change oozie.base.url to http://<loadbalancer_hostname>:11000/oozie
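As a rough illustration of the three oozie-site.xml changes in step 3, the sketch below builds the property values and renders them as Hadoop-style `<property>` entries. It is only a helper for eyeballing the result: the function names are made up, the load balancer hostname you pass in is your own, and it assumes any existing `oozie.services.ext` entries should be kept rather than overwritten.

```python
from xml.sax.saxutils import escape

# Service classes listed in step 3.
ZK_HA_SERVICES = (
    "org.apache.oozie.service.ZKLocksService",
    "org.apache.oozie.service.ZKXLogStreamingService",
    "org.apache.oozie.service.ZKJobsConcurrencyService",
)


def merge_services_ext(existing: str) -> str:
    """Append the ZK services to an existing oozie.services.ext value, skipping duplicates."""
    services = [s.strip() for s in existing.split(",") if s.strip()]
    for cls in ZK_HA_SERVICES:
        if cls not in services:
            services.append(cls)
    return ",".join(services)


def ha_properties(zk_hosts, lb_host, existing_services_ext="",
                  zk_port=2181, oozie_port=11000):
    """Return the three oozie-site.xml properties that step 3 adds or changes."""
    return {
        "oozie.zookeeper.connection.string":
            ",".join(f"{host}:{zk_port}" for host in zk_hosts),
        "oozie.services.ext": merge_services_ext(existing_services_ext),
        "oozie.base.url": f"http://{lb_host}:{oozie_port}/oozie",
    }


def to_property_xml(props) -> str:
    """Render a dict of properties as <property> blocks for a *-site.xml file."""
    blocks = []
    for name, value in props.items():
        blocks.append(
            "<property>\n"
            f"  <name>{escape(name)}</name>\n"
            f"  <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)
```

Printing `to_property_xml(ha_properties([...], "<loadbalancer_hostname>"))` gives blocks that can be compared against what ends up in the Ambari config screen.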

4) In oozie-env.sh config:

– uncomment the OOZIE_BASE_URL property and change its value to point to the load balancer (example value: http://<loadbalancer_hostname>:11000/oozie)

5) In core-site.xml:

– add the host of the newly added oozie-server to the hadoop.proxyuser.oozie.hosts property. Hosts should be comma-separated.
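For step 5, the edit is just a merge into a comma-separated list. A small Python sketch of that merge follows; the function name is hypothetical, and it assumes a value of `*` (all hosts allowed) should be left untouched.

```python
def add_proxy_host(current_value: str, new_host: str) -> str:
    """Return hadoop.proxyuser.oozie.hosts with new_host added, without duplicates."""
    hosts = [h.strip() for h in current_value.split(",") if h.strip()]
    # A wildcard already allows every host, and an existing entry needs no change.
    if "*" in hosts or new_host in hosts:
        return ",".join(hosts)
    return ",".join(hosts + [new_host])
```

For example, passing the current property value and the hostname of the second Oozie server returns the value to paste back into core-site.xml.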

6) Restart all needed services.

Note 1: Oozie HA will work only with an existing (external) database because, as far as I know, the Derby database doesn't support concurrent connections.

Re: How to setup High Availability for Oozie?

Expert Contributor

Guys, do you have any idea why I don't see the Oozie-server option after clicking the +Add button? I tried on every host in my cluster. I am using HDP 2.3 with Ambari 2.1.1.

Re: How to setup High Availability for Oozie?

As a quick add-on, here is the load balancing configuration I used with HAProxy. It seems to work as well and was very easy to set up. Any feedback is welcome.

enableloadbalancingoozie.txt

Re: How to setup High Availability for Oozie?

Which tools do you suggest for load balancing?

Re: How to setup High Availability for Oozie?

New Contributor

@Benjamin Leonhardi How are your logs being recorded on the Oozie servers? Did you set up HAProxy as a transparent proxy? If so, did you account for the X-Forwarded-For header in the Oozie server log config `oozie-log4j.properties`, so that the original Oozie client IP is recorded in the Oozie logs?

Appreciate your help and comment, thanks!

Re: How to setup High Availability for Oozie?

Expert Contributor

Follow instructions as per documentation.

http://docs.hortonworks.com/HDPDocuments/Ambari-2....

In addition: In a Kerberized Environment,

  • Create a new AD account for HTTP/<loadbalancer_hostname>@<realm>.
  • Append the keytab for the AD account to spnego.service.keytab on all hosts running the Oozie servers referenced by the load balancer.
  • Keytabs can be appended as follows:
      ktutil
      addent -password -p HTTP/<loadbalancer_hostname>@<realm> -k 1 -e rc4-hmac
      wkt /etc/security/keytabs/spnego.service.keytab
  • Check the result with klist -ekt spnego.service.keytab:
      Keytab name: FILE:spnego.service.keytab
      KVNO Timestamp Principal
      ---- ----------------- --------------------------------------------------------
      ...
      1 12/17/15 14:45:02 HTTP/<loadbalancer_hostname>@<realm> (arcfour-hmac)
  • After the keytabs are updated, restart the Oozie service from the Ambari UI.
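
If you want to script the klist check across several Oozie hosts, something like the following Python sketch could work. It only parses text in the layout shown in the klist output above (entry rows starting with a numeric KVNO). The function names are hypothetical, and the sketch does not run klist itself.

```python
def keytab_principals(klist_output: str) -> set:
    """Collect the principals listed in `klist -ekt` style output."""
    principals = set()
    for line in klist_output.splitlines():
        tokens = line.split()
        # Entry rows start with the numeric KVNO; header and separator rows do not.
        if tokens and tokens[0].isdigit():
            principals.update(t for t in tokens[1:] if "@" in t)
    return principals


def has_principal(klist_output: str, principal: str) -> bool:
    """True if the given principal appears as an entry in the keytab listing."""
    return principal in keytab_principals(klist_output)
```

Feeding it the captured klist output from each host and the HTTP/<loadbalancer_hostname>@<realm> principal tells you which hosts still need the keytab appended.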

Re: How to setup High Availability for Oozie?

New Contributor
@Saumil Mayani

Have you run into issues with Ambari managing your keytabs, with the manual spnego changes above?

Re: How to setup High Availability for Oozie?

Expert Contributor
@Matthew Sharp

Yes I have. Ambari would not know about the Load Balancer details and hence would not update / append the HTTP/<loadbalancer_hostname>@<realm> to the spnego.service.keytab

Re: How to setup High Availability for Oozie?

New Contributor

Right, I was more worried about future cluster changes through Ambari or if you regenerate keytabs through Ambari. Guessing in some scenarios that would overwrite your manual updates after you have it working. I will have to review jira to see if there is anything out there to address that.