Oozie Workflow: Get running action name

New Contributor

Hi

I am trying to fetch the name of the running action in an Oozie workflow. I know we can get workflow-level details by using a REST request, as given here.

But I have not had any luck getting more detailed, drilled-down information about a workflow.

Can someone please help me out with this?

1 ACCEPTED SOLUTION

Mentor
Your endpoint is incorrect: you're calling /jobs (which returns a list of WFs with high-level info) rather than /job/WFID (which returns a specific WF with all its details). The latter is what you need.

Do this:

req = urllib2.Request('http://xx.xx.xxx.xx:11000/oozie/v1/job/0000096-151104073848042-oozie-hado-W')

(Or use /jobs to iterate over the list of all WFs, calling /job/ID for each item's id field)
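
The two endpoints can be combined roughly as in the sketch below. It is written for Python 3, where the urllib2 calls used in this thread live in urllib.request. The helper names (jobs_url, job_url, fetch_json, all_workflow_details) are hypothetical, and the code assumes the /jobs listing keeps its entries under a "workflows" key, each with an "id" field; check that against the response from your own Oozie server.

```python
import json
from urllib.request import urlopen


def jobs_url(base):
    """URL that lists workflows with high-level info only."""
    return base.rstrip("/") + "/v1/jobs"


def job_url(base, wf_id):
    """URL that returns a single workflow with full details."""
    return base.rstrip("/") + "/v1/job/" + wf_id


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def all_workflow_details(base, fetch=fetch_json):
    """Call /jobs once, then /job/ID for each listed workflow."""
    listing = fetch(jobs_url(base))
    # Assumption: the listing stores its entries under "workflows".
    workflows = listing.get("workflows", [])
    return [fetch(job_url(base, wf["id"])) for wf in workflows]
```

Calling all_workflow_details('http://xx.xx.xxx.xx:11000/oozie') would then return one detailed record per workflow. The fetch parameter is there only so the HTTP call can be swapped out, for example when trying the function without a live server.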


Mentor
Which API endpoint have you tried, specifically? It appears you are looking for this one: http://archive.cloudera.com/cdh5/cdh/5/oozie/WebServicesAPI.html#Job_Information

New Contributor

Thanks Harsh.

I have been using the same Job Information endpoint you mentioned. I guess you are referring to the actions array in the returned JSON, but it comes back empty for all workflows.

Do I need to do something extra to get action-level information?

Edit: By actions I mean the individual actions that we define in the workflow XML. I am using a normal job.properties file as configuration, not a coordinator.

In other words, I am looking for access to the job DAG.

Mentor
Could you share a sample request URL and output received?

It seems to work OK for me, for example with my WF ID of "0000000-151116211358117-oozie-oozi-W":

~> curl -L 'http://localhost:11000/oozie/v2/job/0000000-151116211358117-oozie-oozi-W' > wf.json
~> python
>>> import json
>>> a = json.loads(open('wf.json').read())
>>> len(a['actions'])
2
>>> a['actions'][1]['name']
u'Shell'

FWIW, Hue today uses the same API for its Oozie app dashboards, and it does fetch all actions properly too.

How old is the targeted WF, and are you able to see the list of actions OK in the web UIs?
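
Since the original question is about the currently running action, the parsed response can also be filtered by each action's status. This is a minimal sketch, assuming each entry in actions carries name and status fields and that an executing action reports the status "RUNNING"; the function name running_action_names and the sample data are illustrative only.

```python
import json


def running_action_names(job_info, status="RUNNING"):
    """Return the names of actions whose status equals `status`."""
    return [
        action["name"]
        for action in job_info.get("actions", [])
        if action.get("status") == status
    ]


# Illustrative data only, not output from a real Oozie server.
sample = json.loads(
    '{"id": "wf", "actions": ['
    '{"name": ":start:", "status": "OK"},'
    '{"name": "Shell", "status": "RUNNING"}]}'
)
print(running_action_names(sample))
```

The same function could be applied to the wf.json file saved by the curl command above, after loading it with json.load.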

New Contributor

Here is the sample Python code I am using:

import json
import urllib2

req = urllib2.Request('http://xx.xx.xxx.xx:11000/oozie/v1/jobs?show=info')
response = urllib2.urlopen(req)
output = response.read()
output1 = json.loads(output)
print output1

Output:

{"status": "RUNNING", "run": 0, "startTime": "Mon, 16 Nov 2015 14:40:29 GMT", "appName": "WrkflowGeneratorDemo", "lastModTime": "Mon, 16 Nov 2015 14:43:08 GMT", "actions": [], "acl": null, "appPath": null, "externalId": null, "consoleUrl": "http://ip-xx-xxx-xx-xx:11000/oozie?job=0000096-151104073848042-oozie-hado-W", "conf": null, "parentId": null, "createdTime": "Mon, 16 Nov 2015 14:40:29 GMT", "toString": "Workflow id[0000096-151104073848042-oozie-hado-W] status[RUNNING]", "endTime": null, "id": "0000096-151104073848042-oozie-hado-W", "group": null, "user": "hadoop"}

 

In the request, I have tried specifying len, offset, jobtype, and type, but it always gives the same output.
