

In this tutorial, we're going to leverage Oozie's SLA monitoring features via Workflow Manager. To read more about SLA features in Oozie, see the official Oozie documentation.

We will begin with a simple shell action that sleeps for a fixed duration. Create a new script file and paste in the code below.

echo "start of script execution"
sleep 60
echo "end of script execution"

We're also going to create a workflow HDFS directory and upload this script to it.

hdfs dfs -mkdir -p oozie/shell-sla
hdfs dfs -put oozie/shell-sla/

Let's begin with adding a shell action and populating the script name and file attribute.


Don't forget to check the capture output box. We want to see the output of this action.


I want to submit the workflow to make sure everything works as expected before configuring SLA features.


It's a good idea to preview the XML to make sure the file and exec tags are filled in correctly.


Once the job completes, I want to drill down into it and view the output.


Everything looks good, so we're ready to enable the SLA features of Oozie via Workflow Manager. Click on the shell action and then the gear icon. At the bottom of the configuration page, you will see the SLA section. Expand it and check the enabled box.


The fields are described below:

  • nominal-time: As the name suggests, this is the time relative to which your jobs' SLAs will be calculated. Generally since Oozie workflows are aligned with synchronous data dependencies, this nominal time can be parameterized to be passed the value of your coordinator nominal time. Nominal time is also required in case of independent workflows and you can specify the time in which you expect the workflow to be run if you don't have a synchronous dataset associated with it.
  • should-start: Relative to nominal-time this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should start running to meet SLA. This is optional.
  • should-end: Relative to nominal-time this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should finish to meet SLA.
  • max-duration: This is the maximum amount of time (along with time-unit - MINUTES, HOURS, DAYS) your job is expected to run. This is optional.
  • alert-events: Specify the types of events for which email alerts should be sent. Allowable values in this comma-separated list are start_miss, end_miss and duration_miss. *_met events can generally be deemed low priority, so email alerting for them is not necessary. Note, however, that this setting applies only to email alerts and not to JMS messages, where all events send out notifications and the user can filter them using the desired selectors. This is optional and only applicable when alert-contact is configured.
  • alert-contact: Specify a comma separated list of email addresses where you wish your alerts to be sent. This is optional and need not be configured if you just want to view your job SLA history in the UI and do not want to receive email alerts.
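
To make the relationship between these fields concrete, here is a minimal sketch of how the three miss events could be derived from the thresholds. It is only an illustration of the field semantics described above, not Oozie's actual implementation; the function name `sla_events` and its argument order are hypothetical, times are epoch seconds, and thresholds are in minutes.

```shell
#!/bin/sh
# Illustrative sketch only, NOT Oozie's implementation.
# Usage: sla_events NOMINAL START END SHOULD_START_MIN SHOULD_END_MIN MAX_DURATION_MIN
# NOMINAL, START and END are epoch seconds; the thresholds are in minutes.
sla_events() {
  nominal=$1; start=$2; end=$3
  should_start=$4; should_end=$5; max_duration=$6
  events=""

  # should-start and should-end are offsets from nominal-time.
  if [ "$start" -gt $((nominal + should_start * 60)) ]; then
    events="start_miss"
  fi
  if [ "$end" -gt $((nominal + should_end * 60)) ]; then
    events="${events:+$events,}end_miss"
  fi
  # max-duration is compared against the actual run time (end minus start).
  if [ $((end - start)) -gt $((max_duration * 60)) ]; then
    events="${events:+$events,}duration_miss"
  fi

  # Print the comma-separated misses, or "met" when every threshold held.
  echo "${events:-met}"
}

# Job started 10 minutes after nominal time and ran for 3 minutes,
# against should-start=5, should-end=10, max-duration=1 (all in minutes).
sla_events 0 600 780 5 10 1   # prints start_miss,end_miss,duration_miss
```

Setting the nominal time in the past with tight thresholds, as this tutorial does, is what forces all three events to fire.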

I'm going to simulate each of the SLA miss patterns: my job started later than scheduled, my job completed outside the SLA threshold, and my job took longer to complete than expected. To fill out the nominal time, use the date and clock icons below the date picker to set the correct time. Click x when ready.


Finally, I'd like to change my script to run for 120 seconds instead of 60 to simulate a long duration.

The script should now look like this:

echo "start of script execution"
sleep 120
echo "end of script execution"
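
If you expect to change the duration more than once, one option is to take it as an argument instead of editing the script each time. The sketch below is a hypothetical variant, not part of the original workflow: the function name `run_sleep_job` is made up for illustration, and passing the value through the shell action's argument field is an assumption about how you would wire it up.

```shell
#!/bin/sh
# Hypothetical variant of the tutorial script: the sleep duration is taken
# from the first argument and defaults to 60 seconds when none is given.
run_sleep_job() {
  duration="${1:-60}"
  echo "start of script execution"
  sleep "$duration"
  echo "end of script execution"
}

# Demo call with a 1-second sleep. In the uploaded script you would call
# run_sleep_job "$@" so the action can pass 60 or 120 as needed.
run_sleep_job 1
```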

When ready, re-upload the script.

At this point, I want to make sure the cluster can send mail, so I'll test it by sending a sample email. Enabling mail is beyond the scope of this tutorial; I followed the procedure below, so adjust as necessary for your environment.

sudo su
yum install postfix
/etc/init.d/postfix restart

Now we're able to send mail from our node. Mail needs to work on every node where Oozie may execute a workflow.

mail -s "test"

Press Ctrl-D; you should receive an email shortly.

Finally, there are some changes we need to make on the Oozie side. I'm not going to enable JMS alerting and will concentrate only on the email piece. Please consult the Oozie docs for the JMS part.

This is HDP 2.5.3 and things may look/act differently on your Oozie instance.

Let's go to Ambari > Configs

filter by the following property

We're going to add these services to the existing list:



Once ready, add a couple more custom properties in Oozie. Again, in my environment these properties did not exist.


and the value should be



The Oozie docs also recommend adding the following property to improve the performance of event processing. We're going to add it and set its value to 15.



Once I saved the changes and restarted Oozie, it failed to start. Looking at the logs, I noticed the following in oozie-error.log:

2017-02-15 18:14:32,757  WARN ConfigUtils:523 - SERVER[wfmanager-test-1.openstacklocal] Using a deprecated configuration property [], should use [oozie.service.AuthorizationService.authorization.enabled].  Please delete the deprecated property in order for the new property to take effect.

I found the property in Ambari > Configs and set it to false, since I was not able to delete it.


Once done, restart all Oozie services. You should now see a new tab in Oozie called SLA.


Remember, we only configured the email service, not JMS. We're ready to test our workflow. Before that, I'd like to preview the XML for good measure.
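
With SLA enabled on the action, the preview should contain an `sla:info` element inside the shell action. As a rough reference for what to look for, the sketch below prints a fragment in the shape described by the Oozie SLA documentation. The nominal time, thresholds, and email address are placeholder assumptions rather than the values used in this workflow, and the helper name `print_sla_fragment` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints an example <sla:info> fragment to compare
# against the XML preview. All values are placeholders, not the tutorial's.
# The enclosing <workflow-app> must declare xmlns:sla="uri:oozie:sla:0.2".
print_sla_fragment() {
  # The quoted 'EOF' keeps the shell from expanding the Oozie EL expressions.
  cat <<'EOF'
<sla:info>
  <sla:nominal-time>2017-02-15T18:00Z</sla:nominal-time>
  <sla:should-start>${1 * MINUTES}</sla:should-start>
  <sla:should-end>${2 * MINUTES}</sla:should-end>
  <sla:max-duration>${1 * MINUTES}</sla:max-duration>
  <sla:alert-events>start_miss,end_miss,duration_miss</sla:alert-events>
  <sla:alert-contact>user@example.com</sla:alert-contact>
</sla:info>
EOF
}

print_sla_fragment
```

If the preview is missing any of the optional elements, check that the corresponding field was filled in on the SLA section of the action's configuration page.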


At this point, I'm ready to submit the workflow and watch my inbox. I expect to miss the job start, job end, and duration thresholds. This is the email output of my workflow.




Until next time folks!

Version history
Last update:
‎08-17-2019 02:13 PM