How to set up High Availability for Oozie?

To run the Oozie server in HA, the Hortonworks documentation says that it needs a load balancer, virtual IP, or round-robin DNS. As this is not part of the Hadoop ecosystem, which tool is suggested here: HAProxy, nginx, or a commercial one?

1 ACCEPTED SOLUTION

Ambari does not manage HA for Oozie yet. Here is a list of manual steps that I recently dug out for someone (AMBARI-6683 is the related JIRA, but BUG-13082 has the relevant details you are looking for).

Pasting here:

1) Add the oozie-server component using the +Add button on the host page.

2) Using Apache httpd (with mod_proxy and mod_proxy_balancer), configure load balancing with a URL liveness check. This means that each Oozie URL is checked for availability before the load balancer returns it. We need this because one of the Oozie servers can be unavailable, and the load balancer should not return its URL.

3) In oozie-site.xml config:

– add oozie.zookeeper.connection.string = <list of zookeeper hosts with ports> (example: c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181,c6403.ambari.apache.org:2181)

– add these classes to the oozie.services.ext property: "org.apache.oozie.service.ZKLocksService,org.apache.oozie.service.ZKXLogStreamingService,org.apache.oozie.service.ZKJobsConcurrencyService"

– change oozie.base.url to http://<loadbalancer_hostname>:11000/oozie

4) In oozie-env.sh config:

– uncomment the OOZIE_BASE_URL property and change its value to point to the load balancer (example value: http://<loadbalancer_hostname>:11000/oozie)

5) In core-site.xml:

– add the host of the newly added oozie-server to the hadoop.proxyuser.oozie.hosts property. Hosts should be comma-separated.

6) Restart all needed services.

Note 1: Oozie HA will work only with an existing database (not Derby) because, as far as I know, Derby doesn't support concurrent connections.
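
To illustrate what the URL liveness check in step 2 amounts to, here is a minimal Python sketch. It is not the mod_proxy_balancer configuration itself, only the idea behind it: ask each Oozie server for its status and keep the ones that answer. It assumes the Oozie admin status endpoint (`/v1/admin/status` under the base URL) reports `"systemMode": "NORMAL"` for a healthy server; the function names `fetch_status` and `live_servers` and the host names in the usage note are hypothetical.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: the Oozie REST admin status endpoint, relative to the base URL.
STATUS_PATH = "/v1/admin/status"


def fetch_status(base_url, timeout=5.0):
    """Return the systemMode reported by one Oozie server, or None on any failure."""
    url = base_url.rstrip("/") + STATUS_PATH
    try:
        with urlopen(url, timeout=timeout) as resp:
            return json.load(resp).get("systemMode")
    except (URLError, OSError, ValueError, AttributeError):
        # Unreachable host, HTTP error, or a body that is not the expected JSON.
        return None


def live_servers(base_urls, fetch=fetch_status):
    """Keep only the servers whose status check reports NORMAL.

    `fetch` is injectable so the filter can be exercised without a network.
    """
    return [url for url in base_urls if fetch(url) == "NORMAL"]
```

Calling `live_servers(["http://oozie1.example.com:11000/oozie", "http://oozie2.example.com:11000/oozie"])` would return only the servers that currently respond, which is the set a load balancer should hand out.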

Super Collaborator

Guys, do you have any idea why I don't see the Oozie Server option after clicking the +Add button? I tried on every host in my cluster. I am using HDP 2.3 with Ambari 2.1.1.

Master Guru

As a quick add-on, here is the load-balancing configuration using HAProxy that I used. It seems to work as well and was very easy to set up. Any feedback is welcome.

enableloadbalancingoozie.txt

Which tools do you suggest for load balancing?

Explorer

@Benjamin Leonhardi How are your logs being recorded on the Oozie servers? Did you set up HAProxy as a transparent proxy? If so, did you account for the X-Forwarded-For header in the Oozie server log config `oozie-log4j.properties`, so that the original Oozie client IP is recorded in the Oozie logs?

Appreciate your help and comment, thanks!

Super Collaborator

Follow the instructions in the documentation:

http://docs.hortonworks.com/HDPDocuments/Ambari-2....

In addition, in a Kerberized environment:

  • Create a new AD account for HTTP/<loadbalancer_hostname>@<realm>.
  • Append the keytab for that AD account to spnego.service.keytab on all hosts running Oozie servers referenced by the load balancer.
  • Keytabs can be appended as follows:
      ktutil
      addent -password -p HTTP/<loadbalancer_hostname>@<realm> -k 1 -e rc4-hmac
      wkt /etc/security/keytabs/spnego.service.keytab
  • List the keytab entries to check the result:
      klist -ekt spnego.service.keytab
      Keytab name: FILE:spnego.service.keytab
      KVNO Timestamp         Principal
      ---- ----------------- --------------------------------------------------------
      ...
         1 12/17/15 14:45:02 HTTP/<loadbalancer_hostname>@<realm> (arcfour-hmac)
  • After the keytabs are updated, restart the Oozie service from the Ambari UI.
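
If you need to repeat the keytab check on several Oozie hosts, a small script can do the reading for you. The Python sketch below parses text in the `klist -ekt` format shown above and reports whether the load balancer principal is present. It assumes entry rows start with a numeric KVNO and contain the principal as the first field with an `@` in it; the function names and the example principal in the usage note are hypothetical.

```python
def keytab_principals(klist_output):
    """Collect the principal names found in `klist -ekt` text output."""
    principals = set()
    for line in klist_output.splitlines():
        fields = line.split()
        # Entry rows start with a numeric KVNO; header and separator rows do not.
        if not fields or not fields[0].isdigit():
            continue
        # Assumption: the principal is the first field containing '@'.
        for field in fields[1:]:
            if "@" in field:
                principals.add(field)
                break
    return principals


def has_principal(klist_output, principal):
    """Return True if `principal` appears as an entry in the klist output."""
    return principal in keytab_principals(klist_output)
```

For example, after capturing the output of `klist -ekt /etc/security/keytabs/spnego.service.keytab` into a string `out`, `has_principal(out, "HTTP/lb.example.com@EXAMPLE.COM")` tells you whether the append worked on that host.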

Contributor
@Saumil Mayani

Have you run into issues with Ambari managing your keytabs, with the manual spnego changes above?

Super Collaborator
@Matthew Sharp

Yes, I have. Ambari does not know about the load balancer details and hence does not update or append HTTP/<loadbalancer_hostname>@<realm> to spnego.service.keytab.

Contributor

Right, I was more worried about future cluster changes through Ambari, or about regenerating keytabs through Ambari. I am guessing that in some scenarios that would overwrite your manual updates after you have it working. I will have to review JIRA to see if there is anything out there to address that.