Created 04-01-2016 01:26 PM
I am using the Oozie Sqoop action and want to run a daily hive-import of the same 100 tables from Netezza to Hive. My query runs fine with the Oozie workflow Sqoop action. I use hive-overwrite on the table every time.
(1) If I specify 100 actions in one workflow.xml and it fails after 10 actions, how do I rerun the workflow from the specific failed node?
(2) Alternatively, I could specify 1 Sqoop action per workflow, so I would have 100 workflows, one per action. Which is the best approach?
If we want to rerun an Oozie job from a specific failed node, can you please show me what a sample workflow.xml looks like? I would really appreciate it.
Created 04-01-2016 02:59 PM
@mike pal - Good question!
For case number 1
1. Find out the WF id of the failed/killed job.
2. Prepare a job config file to pass to the rerun command, following the steps below:
2.1 First you need the Oozie job configuration XML file. The easiest way to get it is the -configcontent option of the oozie job command, e.g. on the command line:
export OOZIE_URL="http://<oozie-host>:11000/oozie"
oozie job -configcontent <workflow-id> > job_conf.xml
2.2 Delete the oozie.coord.application.path property from job_conf.xml. This avoids the "E0301: Invalid resource" error on rerun.
2.3 Now add the property below to job_conf.xml. It determines which actions are run in the rerun: every action node listed here is skipped, and all other actions are run. With a lone comma as the value, no node is skipped and all actions of the workflow are run.
To run all actions of a workflow:
<property> <name>oozie.wf.rerun.skip.nodes</name> <value>,</value> </property>
To skip a few actions of a workflow (all the action nodes specified here will be skipped and the rest will be run):
<property> <name>oozie.wf.rerun.skip.nodes</name> <value>action-name1,action-name2,etc.</value> </property>
3. Re-run the workflow with the command below:
oozie job -config "job_conf.xml" -rerun <wf-id>
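With 100 Sqoop actions, editing job_conf.xml by hand for steps 2.2 and 2.3 gets tedious, so here is a rough Python sketch of those two steps. It is only an illustration under some assumptions: the function names and the action names (sqoop-table-001 etc.) are made up for the example, the action statuses are assumed to be copied by hand from the job info, and "OK" is assumed to be the status of an action that succeeded. It does not call Oozie itself; it only rewrites the XML you saved in step 2.1.

```python
import xml.etree.ElementTree as ET

SKIP_PROP = "oozie.wf.rerun.skip.nodes"
COORD_PROP = "oozie.coord.application.path"


def skip_nodes_value(action_status):
    """Build the skip-nodes value from (action name, status) pairs.

    Actions with status "OK" (assumed to mean "succeeded") are skipped.
    If nothing succeeded, return a lone comma, which is the value used
    in this thread to run every action.
    """
    done = [name for name, status in action_status if status == "OK"]
    return ",".join(done) if done else ","


def prepare_rerun_conf(conf_xml, skip_value):
    """Return job config XML that is ready for `oozie job -rerun`.

    Step 2.2: drop oozie.coord.application.path.
    Step 2.3: set oozie.wf.rerun.skip.nodes (replacing any old copy).
    """
    root = ET.fromstring(conf_xml)
    for prop in list(root.findall("property")):
        name = (prop.findtext("name") or "").strip()
        if name in (COORD_PROP, SKIP_PROP):
            root.remove(prop)
    new_prop = ET.SubElement(root, "property")
    ET.SubElement(new_prop, "name").text = SKIP_PROP
    ET.SubElement(new_prop, "value").text = skip_value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical statuses: the first two imports succeeded, the third failed.
    statuses = [
        ("sqoop-table-001", "OK"),
        ("sqoop-table-002", "OK"),
        ("sqoop-table-003", "ERROR"),
    ]
    with open("job_conf.xml") as f:
        original = f.read()
    with open("job_conf.xml", "w") as f:
        f.write(prepare_rerun_conf(original, skip_nodes_value(statuses)))
```

After running it, the rewritten job_conf.xml can be passed to the rerun command from step 3 unchanged. Skipping only the actions that already succeeded means the failed Sqoop import and everything after it run again, while the finished tables are not re-imported.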
Created 04-01-2016 03:56 PM
Thank you so much!
Created 08-18-2016 06:09 AM