Created 03-30-2017 02:27 PM
Hello community,
One of my devs has executed some Oozie workflows with a wrong namenode, and now the workflows are frozen.
I have tried to kill them in every possible way. The command reports that they were successfully killed, but the workflows still appear in the console as RUNNING.
[oozie@hadoop01 oozie]$ oozie jobs -kill -filter status=RUNNING
the following jobs have been killed
Job ID                                App Name                       Status   User     Group  Started               Ended
--------------------------------------------------------------------------------------------------------------------------
0000006-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  batch    -      2017-03-29 14:31 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000004-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 14:23 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000003-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 14:21 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000002-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 13:54 GMT  -
--------------------------------------------------------------------------------------------------------------------------
[oozie@hadoop01 oozie]$ oozie jobs -filter status=RUNNING
Job ID                                App Name                       Status   User     Group  Started               Ended
--------------------------------------------------------------------------------------------------------------------------
0000006-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  batch    -      2017-03-29 14:31 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000004-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 14:23 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000003-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 14:21 GMT  -
--------------------------------------------------------------------------------------------------------------------------
0000002-170324203356317-oozie-oozi-W  BIGDP46B - AppBigRexManClient  RUNNING  bigdata  -      2017-03-29 13:54 GMT  -
--------------------------------------------------------------------------------------------------------------------------
[oozie@LTBIG01 oozie]$
I don't know what is going on. I've tried restarting the server, but the problem persists. I have also tried changing the status to KILLED directly in the DB, in the tables WF_JOBS and WF_ACTIONS, but the workflows keep showing as RUNNING.
I have checked the logs and they are clean.
Do you know what may be going on?
Thank you in advance!
Created 03-31-2017 09:51 PM
@Juan Manuel Nieto, can you please check whether a YARN/MR job related to this Oozie workflow is still running?
Created 04-03-2017 10:27 AM
Hello @yvora, the MR/YARN job is not running; it never started.
Created 04-03-2017 10:35 PM
Which database is Oozie using? Also, what status do you see when you run the -info command on one of the workflows that you are trying to kill? For example:
0000006-170324203356317-oozie-oozi-W
If that is also shown as RUNNING, have you tried killing just one of the workflows by giving its specific workflow ID instead of using the filter? What output does that give?
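As a rough sketch of the -info check suggested above, the snippet below pulls the status out of the `oozie job -info` output for a single workflow. It assumes that the output contains a line of the form `Status : RUNNING`, which may differ between Oozie versions, and that OOZIE_URL holds your server URL. The function names `parse_status` and `wf_status` are made up for this example.

```shell
# Extract the value of the "Status : <value>" line from `oozie job -info` output.
# Assumption: the workflow summary contains such a line; adjust if your version differs.
parse_status() {
  awk -F':' '/^Status[ \t]*:/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit }'
}

# Print the status of one workflow ID (hypothetical helper).
# OOZIE_URL must be set to the Oozie server URL, e.g. http://hostname:port/oozie/
wf_status() {
  oozie job -oozie "${OOZIE_URL:?set OOZIE_URL first}" -info "$1" | parse_status
}
```

For example, `wf_status 0000006-170324203356317-oozie-oozi-W` would print the status that the server reports for that workflow.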
Created 01-12-2018 02:16 PM
@Juan Manuel Nieto, I am not sure whether this question has been resolved by now. However, running the following command will kill a specific Oozie job:
oozie job -oozie http://hostname:port/oozie/ -kill jobID
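If several workflows are stuck, the same command can be run once per job ID instead of relying on the filter. The following is only a sketch: the URL is the same placeholder as in the command above, `kill_workflows` is a made-up helper name, and the DRY_RUN switch is added here just so the commands can be reviewed before anything is killed.

```shell
# Placeholder URL, as in the command above; replace it with your Oozie server.
OOZIE_URL="${OOZIE_URL:-http://hostname:port/oozie/}"

# Kill each workflow ID given as an argument, one at a time (hypothetical helper).
# With DRY_RUN=1 the commands are only printed, not executed.
kill_workflows() {
  for id in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "oozie job -oozie $OOZIE_URL -kill $id"
    else
      oozie job -oozie "$OOZIE_URL" -kill "$id"
    fi
  done
}
```

Running `DRY_RUN=1 kill_workflows 0000006-170324203356317-oozie-oozi-W 0000004-170324203356317-oozie-oozi-W` first shows exactly what would be executed; dropping DRY_RUN then issues the kills one by one, so any error is tied to a single job ID.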
I hope this helps you!