Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3484 | 05-03-2017 05:13 PM |
| | 2862 | 05-02-2017 08:38 AM |
| | 3123 | 05-02-2017 08:13 AM |
| | 3087 | 04-10-2017 10:51 PM |
| | 1578 | 03-28-2017 02:27 AM |
02-15-2017
07:54 PM
2 Kudos
Part 1: https://community.hortonworks.com/articles/82964/getting-started-with-apache-ambari-workflow-design.html
Part 2: https://community.hortonworks.com/articles/82967/apache-ambari-workflow-designer-view-for-apache-oo.html
Part 3: https://community.hortonworks.com/articles/82988/apache-ambari-workflow-designer-view-for-apache-oo-1.html
Part 4: https://community.hortonworks.com/articles/83051/apache-ambari-workflow-designer-view-for-apache-oo-2.html
Part 5: https://community.hortonworks.com/articles/83361/apache-ambari-workflow-manager-view-for-apache-ooz.html
Part 7: https://community.hortonworks.com/articles/84071/apache-ambari-workflow-manager-view-for-apache-ooz-2.html
Part 8: https://community.hortonworks.com/articles/84394/apache-ambari-workflow-manager-view-for-apache-ooz-3.html
Part 9: https://community.hortonworks.com/articles/85091/apache-ambari-workflow-manager-view-for-apache-ooz-4.html
Part 10: https://community.hortonworks.com/articles/85354/apache-ambari-workflow-manager-view-for-apache-ooz-5.html
Part 11: https://community.hortonworks.com/articles/85361/apache-ambari-workflow-manager-view-for-apache-ooz-6.html
Part 12: https://community.hortonworks.com/articles/131389/apache-ambari-workflow-manager-view-for-apache-ooz-7.html

In this tutorial, we're going to leverage Oozie's SLA monitoring features via Workflow Manager. To read more about SLA features in Oozie, see the official documentation: https://oozie.apache.org/docs/4.2.0/DG_SLAMonitoring.html

We will begin with a simple shell action that sleeps for a fixed amount of time. Create a new file called script.sh and paste in the code below.

    echo "start of script execution"
    sleep 60
    echo "end of script execution"
We're also going to create a workflow directory in HDFS and upload the script to it.

    hdfs dfs -mkdir oozie/shell-sla
    hdfs dfs -put script.sh oozie/shell-sla/
Let's begin by adding a shell action and populating the script name and file attribute. Don't forget to check the capture output box; we want to see the output of this action.

I want to submit the workflow to make sure everything works as expected before configuring SLA features. It's a good idea to preview the XML to make sure the file and exec tags are filled in correctly. Once the job completes, I want to drill down into the job and view the output.

Everything looks good, so we're ready to enable the SLA features of Oozie via Workflow Manager. Click on the shell action and then the gear icon. At the bottom of the configuration page, you will see the SLA section. Expand it and check the enabled box. Each field is described below:
nominal-time: As the name suggests, this is the time relative to which your jobs' SLAs will be calculated. Generally, since Oozie workflows are aligned with synchronous data dependencies, this nominal time can be parameterized to be passed the value of your coordinator nominal time. Nominal time is also required in case of independent workflows, and you can specify the time at which you expect the workflow to be run if you don't have a synchronous dataset associated with it.

should-start: Relative to nominal-time, this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should start running to meet SLA. This is optional.

should-end: Relative to nominal-time, this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should finish to meet SLA.

max-duration: This is the maximum amount of time (along with time-unit - MINUTES, HOURS, DAYS) your job is expected to run. This is optional.

alert-events: Specify the types of events for which email alerts should be sent. Allowable values in this comma-separated list are start_miss, end_miss and duration_miss. *_met events can generally be deemed low priority, and hence email alerting for these is not necessary. However, note that this setting applies only to alerts via email and not via JMS messages, where all events send out notifications and the user can filter them using the desired selectors. This is optional and only applicable when alert-contact is configured.

alert-contact: Specify a comma-separated list of email addresses where you wish your alerts to be sent. This is optional and need not be configured if you just want to view your job SLA history in the UI and do not want to receive email alerts.

I'm going to simulate each one of the SLA patterns, i.e. my job started later than scheduled, my job completed outside the SLA threshold and, finally, my job took longer to complete than we were expecting. To fill out the nominal time, use the date and clock icons below the date picker to set the correct time. Click x when ready.

Finally, I'd like to change my script to run for 120 seconds instead of 60 to simulate a long duration. My script should look like so:

    echo "start of script execution"
    sleep 120
    echo "end of script execution"
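The three miss conditions being simulated here (late start, late end, long duration) can be sketched in a few lines of Python. This is only an illustrative model of the field semantics described above, not Oozie's actual implementation; the `SlaSpec` class, the `sla_misses` function and the sample timestamps are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SlaSpec:
    """Hypothetical container mirroring the SLA fields described above."""
    nominal_time: datetime
    should_end: timedelta                     # relative to nominal_time
    should_start: Optional[timedelta] = None  # optional, relative to nominal_time
    max_duration: Optional[timedelta] = None  # optional


def sla_misses(spec: SlaSpec, actual_start: datetime, actual_end: datetime) -> List[str]:
    """Return the miss events (start_miss, end_miss, duration_miss) for one run."""
    misses = []
    # Late start: the job began after nominal_time + should_start.
    if spec.should_start is not None and actual_start > spec.nominal_time + spec.should_start:
        misses.append("start_miss")
    # Late end: the job finished after nominal_time + should_end.
    if actual_end > spec.nominal_time + spec.should_end:
        misses.append("end_miss")
    # Long duration: the job ran longer than max_duration.
    if spec.max_duration is not None and actual_end - actual_start > spec.max_duration:
        misses.append("duration_miss")
    return misses


# Assumed sample values: nominal time in the past, job sleeps for 120 seconds.
nominal = datetime(2017, 2, 15, 18, 0)
spec = SlaSpec(
    nominal_time=nominal,
    should_start=timedelta(minutes=5),
    should_end=timedelta(minutes=10),
    max_duration=timedelta(minutes=1),
)
start = nominal + timedelta(minutes=30)
end = start + timedelta(seconds=120)
print(sla_misses(spec, start, end))
```

Setting the nominal time well before the actual submission time, and max-duration below the 120-second sleep, is what makes all three conditions trip in this sketch.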
When ready, re-upload the script. At this point, I want to make sure that sending mail from the cluster is possible, and I will test that by sending a sample email. Enabling mail is beyond the scope of this tutorial; I followed the procedure below, adjust as necessary for your environment.

    sudo su
    yum install postfix
    /etc/init.d/postfix restart
    exit

Now we're able to send mail from our node. Mail needs to work on any node where Oozie may execute a workflow.

    mail -s "test" email@email.com
Press Ctrl-D to send; you should get an email shortly.

Finally, there are some changes we need to implement on the Oozie side. I'm not going to enable JMS alerting and will concentrate only on the email piece. Please consult the Oozie docs for the JMS part. This is HDP 2.5.3, and things may look or act differently on your Oozie instance.

Let's go to Ambari > Configs and filter by the following property:

    oozie.services.ext

We're going to add these services to the existing list:

    org.apache.oozie.service.EventHandlerService,
    org.apache.oozie.sla.service.SLAService

Once ready, add a couple more custom properties in Oozie; again, in my environment these properties did not exist. The first property is

    oozie.service.EventHandlerService.event.listeners

and the value should be

    org.apache.oozie.sla.listener.SLAJobEventListener,
    org.apache.oozie.sla.listener.SLAEmailEventListener
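Because oozie.services.ext is a comma-separated list that usually already holds other entries, it is easy to overwrite or duplicate a class name when editing it by hand. The snippet below is a minimal sketch of appending the two services without duplicating existing entries; the `append_unique` helper and the `com.example.ExistingService` placeholder are hypothetical and only stand in for whatever your cluster already lists.

```python
from typing import List


def append_unique(existing: str, additions: List[str]) -> str:
    """Append class names to a comma-separated property value, skipping duplicates."""
    items = [item.strip() for item in existing.split(",") if item.strip()]
    for name in additions:
        if name not in items:
            items.append(name)
    return ",".join(items)


# Placeholder for the current property value; not a real Oozie default.
current = "com.example.ExistingService, org.apache.oozie.service.EventHandlerService"
sla_services = [
    "org.apache.oozie.service.EventHandlerService",
    "org.apache.oozie.sla.service.SLAService",
]
print(append_unique(current, sla_services))
```

The resulting string can then be pasted back into the property field in Ambari.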
The Oozie docs also recommend adding the following property to improve the performance of event processing. We're going to add this property and set its value to 15.

    oozie.service.SchedulerService.threads

Once I saved the changes and restarted Oozie, it failed to start. Looking at the logs, I noticed the following in oozie-error.log:

    2017-02-15 18:14:32,757 WARN ConfigUtils:523 - SERVER[wfmanager-test-1.openstacklocal] Using a deprecated configuration property [oozie.service.AuthorizationService.security.enabled], should use [oozie.service.AuthorizationService.authorization.enabled]. Please delete the deprecated property in order for the new property to take effect.
I found the property in Ambari > Configs and set it to false, as I was not able to delete it. Once done, restart all Oozie services and you will see a new tab in Oozie called SLA.

Remember, we only configured the email service, not JMS. We're ready to test our workflow. Before that, I'd like to preview the XML for good measure.

At this point, I'm ready to submit the workflow and watch my inbox. I'm expecting to miss my job start, job end and duration. This is an email output of my workflow.

Until next time folks!
03-03-2017
06:37 AM
@Artem Ervits Unfortunately, such functionality is not currently available.
07-05-2018
06:58 PM
I want to check whether a file is present in HDFS and make a decision based on the result. How can I do that?
02-15-2017
03:26 PM
@Angelo Alexander please refer to the following doc. You can also download the MySQL driver jar from the MySQL website and place it in /usr/hdp/current/sqoop-client/lib

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_data-movement-and-integration/content/apache_sqoop_connectors.html
01-25-2018
10:00 PM
Hi,

Thank you for providing these examples. I went through this one but could not make it work. The job is being killed for some reason. Looking at the error log of the workflow, this is the message I got:

    USER[admin] GROUP[-] TOKEN[] APP[pigwf] JOB[0000006-180125094008677-oozie-oozi-W] ACTION[0000006-180125094008677-oozie-oozi-W@pig_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

I also looked at /var/log/oozie/oozie-error.log and found the following messages:

    2018-01-25 11:20:07,800 WARN ParameterVerifier:523 - SERVER[my.server.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
    2018-01-25 11:20:07,808 WARN LiteWorkflowAppService:523 - SERVER[my.server.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://my.server.com:8020/tmp/data/lib] does not exist
    2018-01-25 11:20:30,527 WARN PigActionExecutor:523 - SERVER[PD-Hortonworks-DATANODE2.network.com] USER[admin] GROUP[-] TOKEN[] APP[pigwf] JOB[0000004-180125094008677-oozie-oozi-W] ACTION[0000004-180125094008677-oozie-oozi-W@pig_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

I ran the same script as you suggested and also tested it in the shell, which gave me the result I was looking for. I also tested the Oozie workflow from your Part 1 tutorial ("making a shell command") and it worked. I checked the workflow.xml files and everything looks like yours. Could you please help me find what my problem is?

Thanks,
Sam
06-04-2018
12:33 PM
@Artem Ervits How do I add a coordinator to the workflow so that the script executes every minute indefinitely?
02-11-2017
07:53 PM
13 Kudos
This is the second in the series of articles on WFM.

Part 1: https://community.hortonworks.com/articles/82964/getting-started-with-apache-ambari-workflow-design.html
Part 3: https://community.hortonworks.com/articles/82988/apache-ambari-workflow-designer-view-for-apache-oo-1.html
Part 4: https://community.hortonworks.com/articles/83051/apache-ambari-workflow-designer-view-for-apache-oo-2.html
Part 5: https://community.hortonworks.com/articles/83361/apache-ambari-workflow-manager-view-for-apache-ooz.html
Part 6: https://community.hortonworks.com/articles/83787/apache-ambari-workflow-manager-view-for-apache-ooz-1.html
Part 7: https://community.hortonworks.com/articles/84071/apache-ambari-workflow-manager-view-for-apache-ooz-2.html
Part 8: https://community.hortonworks.com/articles/84394/apache-ambari-workflow-manager-view-for-apache-ooz-3.html
Part 9: https://community.hortonworks.com/articles/85091/apache-ambari-workflow-manager-view-for-apache-ooz-4.html
Part 10: https://community.hortonworks.com/articles/85354/apache-ambari-workflow-manager-view-for-apache-ooz-5.html
Part 11: https://community.hortonworks.com/articles/85361/apache-ambari-workflow-manager-view-for-apache-ooz-6.html
Part 12: https://community.hortonworks.com/articles/131389/apache-ambari-workflow-manager-view-for-apache-ooz-7.html

In this tutorial, we're going to import an existing workflow with a Python script wrapped in a shell action. Existing workflows can live on HDFS or in your local file system. Let's fetch one onto the canvas.

My workflow is already on HDFS, so that's the option I select. The WFM view is integrated with a WebHDFS browser, which makes navigating the directory tree very easy. Navigate to the directory in HDFS with the desired workflow and hit select.

Once the workflow is imported, WFM will run validation on the syntax and present it for further modification. Now you can modify the python-node by hovering over it and clicking the gear icon.
Once clicked, you can configure the rest of the action to your liking. It inherits all of the old properties of your workflow. Notice you can specify a directory and script file in the File text box. My Oozie workflow also has old properties such as the YARN queue specification; WFM correctly parses and inherits that property. Also notice I have capture output checked, as I'd like to see what my script prints to the console.

At this point, I'm ready to preview my workflow. WFM comes with a handy XML preview. It looks all right to me, so I'm ready to submit. Notice WFM doesn't know what $jobTracker is and prompts me to fill that out along with the queue.

At this point we can navigate to the WFM Dashboard tab, as you've seen in my previous tutorial, and track the job status. My job failed; I can debug the job status directly from WFM. It turns out the issue is with my parameter $jobTracker: in WFM it was renamed to $resourceManager, and it comes by default. I need to remove my custom parameter and let WFM do what it does best. Here's a preview of my XML after the change.

Back in the dashboard, I can click on the job and investigate the status. My job completed successfully, and I can navigate to the YARN job status straight from WFM. I need to click on my succeeded workflow and then click on the arrow icon. It is on the right, in the same row as the python-node.

Finally, navigate to the logs of your YARN job to view the output.

And that's all for this tutorial. You learned how to import an existing Python Oozie workflow and further edit it via WFM. My Python script, by the way, has the following code:

    #!/usr/bin/env python
    import os, pwd, sys
    # Look up the name of the user the action runs as
    print("who am I? " + pwd.getpwuid(os.getuid())[0])
    print("this is a Python script")
    print("Python Interpreter Version: " + sys.version)

You can find my workflow along with other samples on my GitHub page: https://github.com/dbist/oozie

Stay tuned!
04-06-2017
08:16 AM
@artem: thanks for the tutorial. Do you know about this error with Oozie: "oozie error workflow manager kerberos"? Thanks
02-09-2017
09:03 PM
@Amit Panda it will help the community if you publish the results of your findings with support. It may serve as a reference for other customers having the same issue.
02-02-2017
02:09 PM
Excellent, I will review the language in the documentation and issue a pull request to change it. Sorry for the confusion.