New Contributor
Posts: 4
Registered: ‎11-29-2013

JA017: Unknown hadoop job

I recently upgraded. I am getting an error when running a simple Oozie workflow on CDH5 YARN.

 

JA017: Unknown hadoop job [job_1401738543820_0003] associated with action [0000003-140602155011038-oozie-oozi-W@stage]. Failing this action!

 

The jobhistory server is not showing any jobs. Has anyone come across this problem?

Posts: 1,506
Kudos: 255
Solutions: 230
Registered: ‎07-31-2013

Re: JA017: Unknown hadoop job

Do you have further Oozie logs pertaining to the WF instance of 0000003-140602155011038-oozie-oozi-W? They may be needed to help you troubleshoot this issue.
Backline Customer Operations Engineer
New Contributor
Posts: 3
Registered: ‎06-15-2015

Re: JA017: Unknown hadoop job

Hi,

I'm facing the same issue. Does anybody have an answer for it? Below is the complete job log.

2015-06-14 22:45:38,691 INFO ActionStartXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@:start:] Start action [0000000-150614224523980-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2015-06-14 22:45:38,692 INFO ActionStartXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@:start:] [***0000000-150614224523980-oozie-oozi-W@:start:***]Action status=DONE
2015-06-14 22:45:38,692 INFO ActionStartXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@:start:] [***0000000-150614224523980-oozie-oozi-W@:start:***]Action updated in DB!
2015-06-14 22:45:39,525 INFO ActionStartXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] Start action [0000000-150614224523980-oozie-oozi-W@pig-node] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2015-06-14 22:45:48,366 WARN PigActionExecutor:544 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] Exception in check(). Message[JA017: Unknown hadoop job [job_local1298148369_0001] associated with action [0000000-150614224523980-oozie-oozi-W@pig-node]. Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Unknown hadoop job [job_local1298148369_0001] associated with action [0000000-150614224523980-oozie-oozi-W@pig-node]. Failing this action!
at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1200)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1137)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
at org.apache.oozie.command.XCommand.call(XCommand.java:281)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
2015-06-14 22:45:48,375 WARN ActionStartXCommand:544 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] Error starting action [pig-node]. ErrorType [FAILED], ErrorCode [JA017], Message [JA017: Unknown hadoop job [job_local1298148369_0001] associated with action [0000000-150614224523980-oozie-oozi-W@pig-node]. Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Unknown hadoop job [job_local1298148369_0001] associated with action [0000000-150614224523980-oozie-oozi-W@pig-node]. Failing this action!
at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1200)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1137)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
at org.apache.oozie.command.XCommand.call(XCommand.java:281)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
2015-06-14 22:45:48,379 WARN ActionStartXCommand:544 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] Failing Job due to failed action [pig-node]
2015-06-14 22:45:48,389 WARN LiteWorkflowInstance:544 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] Workflow Failed. Failing node [pig-node]
2015-06-14 22:45:49,088 INFO KillXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[-] STARTED WorkflowKillXCommand for jobId=0000000-150614224523980-oozie-oozi-W
2015-06-14 22:45:49,251 INFO KillXCommand:541 - SERVER[localhost] USER[training] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[-] ENDED WorkflowKillXCommand for jobId=0000000-150614224523980-oozie-oozi-W
2015-06-14 22:45:49,332 INFO CallbackServlet:541 - SERVER[localhost] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] callback for action [0000000-150614224523980-oozie-oozi-W@pig-node]
2015-06-14 22:45:49,471 ERROR CompletedActionXCommand:538 - SERVER[localhost] USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000000-150614224523980-oozie-oozi-W] ACTION[0000000-150614224523980-oozie-oozi-W@pig-node] XException,
org.apache.oozie.command.CommandException: E0800: Action it is not running its in [FAILED] state, action [0000000-150614224523980-oozie-oozi-W@pig-node]
at org.apache.oozie.command.wf.CompletedActionXCommand.eagerVerifyPrecondition(CompletedActionXCommand.java:77)
at org.apache.oozie.command.XCommand.call(XCommand.java:251)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

Let me know if any other details are required. I am using CDH 5.2.0.

Cloudera Employee
Posts: 241
Registered: ‎01-16-2014

Re: JA017: Unknown hadoop job

Check whether the job logs are available directly in the job history server (JHS). If they are, check the following:

Check that Oozie knows where to find the JHS: look at yarn.log.server.url and set it if it is not set.

Another possibility is wrong permissions on the /tmp/logs directory on HDFS. See the documentation for the proper settings.
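For anyone who wants to check the setting from a shell, here is a minimal sketch that reads one property out of a Hadoop-style XML config file. The function name get_hadoop_property is made up for this example, and the path in the usage comment is only a common default, so adjust it to wherever your cluster keeps its config. It also assumes each <name> and <value> tag sits on a single line, as in typical *-site.xml files; it is not a real XML parser.

```shell
# Hypothetical helper: print the value of one property from a Hadoop XML
# config file. Assumes every <name>...</name> and <value>...</value> pair
# fits on a single line.
get_hadoop_property() {
    # $1: path to the config file, $2: property name to look up
    awk -v want="$2" '
        /<name>/ {
            name = $0
            sub(/.*<name>[ \t]*/, "", name)      # drop everything up to the tag
            sub(/[ \t]*<\/name>.*/, "", name)    # drop the closing tag and the rest
        }
        /<value>/ && name == want {
            value = $0
            sub(/.*<value>[ \t]*/, "", value)
            sub(/[ \t]*<\/value>.*/, "", value)
            print value
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }               # non-zero status when not set
    ' "$1"
}

# Usage (the path is an assumption, not taken from this thread):
#   get_hadoop_property /etc/hadoop/conf/yarn-site.xml yarn.log.server.url \
#       || echo "yarn.log.server.url is not set in that file"
```

A non-zero exit status only means the property is missing from that particular file; it may still be set elsewhere, for example through a management tool.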

 

Wilfred

New Contributor
Posts: 3
Registered: ‎06-15-2015

Re: JA017: Unknown hadoop job

Thanks for the reply.

I'm a bit new to Hadoop, so where can I find the job history server logs? I'm looking at http://localhost:50030/jobhistory.jsp but I don't see any logs related to the Pig jobs.

In which config file do I need to check that Oozie can find the JHS? Below is what I get when running sudo jps.

3096
20352 Jps
14207 TaskTracker
3461 AlertPublisher
1981 QuorumPeerMain
2469 RunJar
2586 RunJar
14065 JobTracker
5994 EventCatcherService
14374 DataNode
14565 NameNode
5881 Main
5880 Main
14714 SecondaryNameNode
4633 Bootstrap
3063

 

Am I missing something? I can't see a logs folder under /tmp on HDFS.

I can't see any yarn-site.xml in my hadoop/conf folder. Only mapred-site.xml is there, with the configuration below.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
<property>
<name> mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
</property>

<!-- Enable Hue plugins -->
<property>
<name>mapred.jobtracker.plugins</name>
<value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
<description>Comma-separated list of jobtracker plug-ins to be activated.
</description>
</property>
<property>
<name>jobtracker.thrift.address</name>
<value>0.0.0.0:9290</value>
</property>
</configuration>

Can you send me the link to the documentation?

The Hadoop version is 2.5.0 and the CDH version is 5.2.0.

Explorer
Posts: 16
Registered: ‎01-21-2015

Re: JA017: Unknown hadoop job

Hi,

I am facing the same issue in our staging and development environments. My jobs were running fine from Oozie, and since yesterday they have been failing with the error below. Please let me know how I can fix this.

 

 

Could not find job job_1443421033071_0003.

Job job_1443421033071_0003 could not be found: {"RemoteException":{"exception":"NotFoundException","message":"java.lang.Exception: job, job_1443421033071_0003, is not found","javaClassName":"org.apache.hadoop.yarn.webapp.NotFoundException"}} (error 404)

 

 

 

Oozie log:

OB[0000002-150928101824518-oozie-oozi-W] ACTION[0000002-150928101824518-oozie-oozi-W@GenerateParameters] Exception while executing check(). Error Code [JA017], Message[JA017: Unknown hadoop job [job_1443421033071_0003] associated with action [0000002-150928101824518-oozie-oozi-W@GenerateParameters]. Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Unknown hadoop job [job_1443421033071_0003] associated with action [0000002-150928101824518-oozie-oozi-W@GenerateParameters]. Failing this action!
at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1210)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:181)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:55)
at org.apache.oozie.command.XCommand.call(XCommand.java:281)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-09-28 10:20:45,070 WARN org.apache.oozie.command.wf.ActionCheckXCommand: SERVER[xxxx.cmpny.com] USER[cdh_user] GROUP[-] TOKEN[] APP[ImportBookingData] JOB[0000002-150928101824518-oozie-oozi-W] ACTION[0000002-150928101824518-oozie-oozi-W@GenerateParameters] Failing Job due to failed action [GenerateParameters]
2015-09-28 10:20:45,077 WARN org.apache.oozie.workflow.lite.LiteWorkflowInstance: SERVER[xxxx.cmpny.com] USER[cdh_user] GROUP[-] TOKEN[] APP[ImportBookingData] JOB[0000002-150928101824518-oozie-oozi-W] ACTION[0000002-150928101824518-oozie-oozi-W@GenerateParameters] Workflow Failed. Failing node [GenerateParameters]
2015-09-28 10:20:49,884 INFO org.apache.oozie.command.wf.KillXCommand: SERVER[xxxx.cmpny.com] USER[cdh_user] GROUP[-] TOKEN[] APP[ImportBookingData] JOB[0000002-150928101824518-oozie-oozi-W] ACTION[-] STARTED WorkflowKillXCommand for jobId=0000002-150928101824518-oozie-oozi-W
2015-09-28 10:20:52,438 INFO org.apache.oozie.command.wf.KillXCommand: SERVER[xxxx.cmpny.com] USER[cdh_user] GROUP[-] TOKEN[] APP[ImportBookingData] JOB[0000002-150928101824518-oozie-oozi-W] ACTION[-] ENDED WorkflowKillXCommand for jobId=0000002-150928101824518-oozie-oozi-W

 

Thanks,

Rathish 

 

Cloudera Employee
Posts: 241
Registered: ‎01-16-2014

Re: JA017: Unknown hadoop job

Can you make sure the JHS is up and running, and that it can find the job when you look for it in the UI?

You can also use the command line to check whether the logs are on HDFS:

  yarn logs -applicationId APP_ID -appOwner USER_ID

APP_ID is the job ID that you showed, with the "job" prefix replaced by "application".

USER_ID is the ID of the user that ran the job.
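The ID rewrite described above can be sketched as a small shell function. The name job_to_app_id is invented for this example. The special case for job_local IDs is my own addition: an ID such as job_local1298148369_0001 (seen earlier in this thread) looks like it came from the local job runner, so I would not expect a YARN application to exist for it.

```shell
# Hypothetical helper: turn a MapReduce job ID into the matching YARN
# application ID by swapping the "job_" prefix for "application_".
job_to_app_id() {
    case "$1" in
        job_local*)
            # Assumption: local-runner jobs have no YARN application.
            printf 'looks like a local-runner job, no YARN application: %s\n' "$1" >&2
            return 1
            ;;
        job_*)
            printf 'application_%s\n' "${1#job_}"
            ;;
        *)
            printf 'not a job ID: %s\n' "$1" >&2
            return 1
            ;;
    esac
}

# Usage, with the job ID and user that appear in the Oozie log above:
#   yarn logs -applicationId "$(job_to_app_id job_1443421033071_0003)" -appOwner cdh_user
```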

 

Wilfred

Explorer
Posts: 16
Registered: ‎01-21-2015

Re: JA017: Unknown hadoop job


Hi Wilfred,

 

Thanks for the quick response.

 

Yes, my history server is up and running.

I can see the YARN logs for the application ID:

 

sudo -u hdfs hadoop fs -ls /tmp/logs/xxxx/logs/application_1443421033071_0057
Found 2 items
-rw-r----- 2 xxxx hadoop 117553 2015-09-28 11:40 /tmp/logs/xxxx/logs/application_1443421033071_0057/xxxx06.cmpny.com_8041
-rw-r----- 2 xxxx hadoop 30524 2015-09-28 11:40 /tmp/logs/xxxx/logs/application_1443421033071_0057/xxx04.cmpny.com_8041

 

But this job is not listed at the Job History Server URL:

 

http://xxxx:19888/jobhistory/job/job_1443421033071_0057
Not Found: job_1443421033071_0057

 

Is there a permission issue?

 

sudo -u hdfs hadoop fs -ls /tmp/logs/xxxx/logs | grep application_1443421033071_0057
drwxrwx--- - xxxx hadoop 0 2015-09-28 11:40 /tmp/logs/xxxx/logs/application_1443421033071_0057

sudo -u hdfs hadoop fs -ls /tmp/logs/xxxx/
drwxrwx--- - xxxx hadoop 0 2015-09-28 12:05 /tmp/logs/xxxx/logs

sudo -u hdfs hadoop fs -ls /tmp/logs/
drwxrwxrwx - xxxx hadoop 0 2015-09-26 04:46 /tmp/logs/xxxx

 

 

# history - intermediate dir permissions

 

sudo -u hdfs hadoop fs -ls /user/history/done_intermediate/xxxx/ | grep 1443421033071_0057
-rwxrwx--- 2 xxxx hadoop 21245 2015-09-28 11:40 /user/history/done_intermediate/xxxx/job_1443421033071_0057-1443426017794-xxxx-oozie%3Alauncher%3AT%3Dshell%3AW%3XXXDataIngestion%3AA%3DMo-1443426048390-1-0-SUCCEEDED-root.xxxx-1443426028405.jhist
-rwxrwx--- 2 xxxx hadoop 431 2015-09-28 11:40 /user/history/done_intermediate/xxxx/job_1443421033071_0057.summary
-rwxrwx--- 2 xxxx hadoop 111710 2015-09-28 11:40 /user/history/done_intermediate/xxxx/job_1443421033071_0057_conf.xml

 

 

Thanks,

Rathish A M

 

Cloudera Employee
Posts: 241
Registered: ‎01-16-2014

Re: JA017: Unknown hadoop job

Files should be moved from the done_intermediate directory to the done directory during the normal running of the JHS.

Two things to check:

- Does the JHS show any errors in its logs?

- Run the following command on the host that runs the JHS: id -Gn mapred

  It should output: "mapred hadoop"

That assumes the JHS runs as the mapred user. If it runs as another user, replace mapred in the id command with that user.
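If you want to script the group check instead of reading the output by eye, something like the following should work. The function name has_group is made up for this sketch, and it simply looks for an exact group name in the space-separated list that id -Gn prints.

```shell
# Hypothetical helper: succeed if the group list in $1 (as printed by
# `id -Gn USER`) contains the exact group name in $2.
has_group() {
    for g in $1; do                 # unquoted on purpose: split on whitespace
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Usage on the JHS host, assuming the JHS runs as the mapred user:
#   has_group "$(id -Gn mapred)" hadoop || echo "mapred is not in the hadoop group"
```

Matching whole words avoids a false positive from a group whose name merely contains "hadoop".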

 

Wilfred

Explorer
Posts: 16
Registered: ‎01-21-2015

Re: JA017: Unknown hadoop job

Hi Wilfred,

 

Thanks for the support.

We were able to fix the issue. It was related to the permissions on the history and tmp folders.

 

 

Thanks,

Rathish A M

 
