
How to run Oozie Job with Python Script in Sandbox?

1 ACCEPTED SOLUTION

Master Mentor

I am on mobile and can't comment on your workflows right now, but I have examples of python2 and python3 workflows in my repo: https://github.com/dbist/oozie

Browse to oozie/apps/ and you will see their respective directories. Use as you wish.


Thank you for your prompt reply. But I have a very basic question: does Python need to be installed in the Sandbox? (Attachment: apps-directory.png) Currently my Python scripts are placed in /root/examples/apps/map-reduce/, but I am guessing there should be a Python folder at /root/examples/apps/Python that contains the job.properties and workflow.xml files in addition to the lib folder. It would be great if you could tell me which directory I should place the Python script files in.

Master Mentor

@justlearning the same version of Python needs to be installed on every node that will run Oozie containers (that is, every NodeManager). The same goes for any Python libraries you import in your script. I usually create the following tree:

admin@u1201:~/oozie/apps/python$ tree
.
|-- job.properties
|-- scripts
|   `-- script.py
`-- workflow.xml


1 directory, 3 files

So what you want is a workflow directory on HDFS containing at least workflow.xml and, optionally, another directory within it that holds the Python script. The job.properties file needs to be on your local filesystem. Then you would execute the Oozie workflow the following way:

oozie job -oozie http://u1203.ambari.apache.org:11000/oozie -config oozie/apps/python/job.properties -run
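For anyone following along, here is a minimal sketch of how the tree above could be created locally. It is not taken from the linked repo. It assumes an Oozie shell action that calls `python` on the script, and the host names and ports in job.properties, the workflow name `python-wf`, and the HDFS target path are all placeholders to replace with your cluster's values.

```shell
#!/bin/sh
# Sketch only: builds the local workflow tree (job.properties, workflow.xml,
# scripts/script.py). All hosts, ports and names below are assumptions.
set -eu

APP_DIR="${APP_DIR:-oozie/apps/python}"   # local path, mirrors the tree above
mkdir -p "$APP_DIR/scripts"

# Placeholder Python script; replace with your own.
cat > "$APP_DIR/scripts/script.py" <<'EOF'
print("hello from oozie")
EOF

# Assumed workflow: a shell action that runs "python script.py".
# The <file> element ships the script from the workflow directory on HDFS
# into the container's working directory.
cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="python-wf">
    <start to="python-node"/>
    <action name="python-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>python</exec>
            <argument>script.py</argument>
            <file>scripts/script.py#script.py</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Python action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

# Assumed properties; NAMENODE_HOST and RESOURCEMANAGER_HOST are placeholders.
cat > "$APP_DIR/job.properties" <<'EOF'
nameNode=hdfs://NAMENODE_HOST:8020
jobTracker=RESOURCEMANAGER_HOST:8050
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/apps/python
EOF

# Next steps, printed rather than executed because they need a running cluster.
echo "Upload the workflow directory to HDFS, for example:"
echo "  hdfs dfs -mkdir -p /user/$(whoami)/oozie/apps"
echo "  hdfs dfs -put -f $APP_DIR /user/$(whoami)/oozie/apps/"
```

After uploading, the `oozie job ... -config oozie/apps/python/job.properties -run` command above submits the workflow. Calling `python` explicitly avoids relying on the script's executable bit or shebang line, but it does require that `python` is on the PATH of every NodeManager.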


@Artem Ervits oh, thank you so much. How do you determine where the output of the job should be stored, and how can you see the output to be sure it is what you're looking for?

Master Mentor

It's good practice to accept an answer if it satisfies your needs.