Created on 10-25-2015 03:32 AM - edited 09-16-2022 02:45 AM
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 138, in <module>
OozieServiceCheck().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 61, in service_check
OozieServiceCheckDefault.oozie_smoke_shell_file(smoke_test_file_name, prepare_hdfs_file_name)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 123, in oozie_smoke_shell_file
logoutput=True
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 260, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/var/lib/ambari-agent/tmp/oozieSmoke2.sh redhat /usr/hdp/current/oozie-client /usr/hdp/current/oozie-client/conf /usr/hdp/current/oozie-client/bin http://hdpmaster3.bigdataprod1.wh.xxxxxcorp.com:11000/oozie /usr/hdp/current/oozie-client/doc /usr/hdp/current/hadoop-client/conf /usr/hdp/current/hadoop-client/bin ambari-qa False' returned 1. source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
Job ID : 0000005-151024200514443-oozie-oozi-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path : hdfs://WHPROD1NN/user/ambari-qa/examples/apps/map-reduce/workflow.xml
Status : FAILED
Run : 0
User : ambari-qa
Group : -
Created : 2015-10-25 03:15 GMT
Started : 2015-10-25 03:15 GMT
Last Modified : 2015-10-25 03:15 GMT
Ended : 2015-10-25 03:15 GMT
CoordAction ID: -
Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@mr-node FAILED - - EJ001
-------------
Created 10-28-2015 04:21 AM
The error code suggests that the Oozie sharelib may not be set up correctly. Was this cluster upgraded? You can also try the following command to see whether it returns more information or an error message:
/usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -info 0000005-151024200514443-oozie-oozi-W@mr-node
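The error code referred to here is the one in the Actions table of the original post (EJ001 on mr-node). As a rough sketch, failed rows can be pulled out of workflow-level `oozie job -info` output like this. The function name `failed_actions` and the regular expression are illustrative, and the code assumes the five-column layout (ID, Status, Ext ID, Ext Status, Err Code) shown in the post. It is written for Python 3, not the Python 2.6 that appears in the traceback.

```python
import re

# One row of the "Actions" table: ID, Status, Ext ID, Ext Status, Err Code.
# Assumes whitespace-separated columns with "-" for empty values, as in the post.
ACTION_ROW = re.compile(
    r"^(?P<id>\S+@\S+)\s+(?P<status>\S+)\s+(?P<ext_id>\S+)"
    r"\s+(?P<ext_status>\S+)\s+(?P<err_code>\S+)"
)

# Action statuses treated as failures in this sketch.
FAILURE_STATUSES = {"FAILED", "ERROR", "KILLED"}


def failed_actions(info_output):
    """Return a list of dicts, one per failed action row in the given text."""
    rows = []
    for line in info_output.splitlines():
        match = ACTION_ROW.match(line.strip())
        if match and match.group("status") in FAILURE_STATUSES:
            rows.append(match.groupdict())
    return rows
```

Fed the table from the original post, this returns a single row for `...@mr-node` with `err_code` set to `EJ001`; the `:start:` row is skipped because its status is OK.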
Created 10-25-2015 11:21 AM
@avoma@hortonwork.com
Which version of Ambari are you running? Is Kerberos enabled on this cluster?
Also, please remove the customer reference from the log files.
Created on 10-26-2015 03:46 AM - edited 08-19-2019 05:55 AM
The trace you provided is not enough to determine the root cause. The log shows that the MR step failed but does not say why. You can get the exact details of the failure with the steps below:
1) Click on the job instance in Oozie.
2) On the next page, double-click the step that failed.
3) In the popup, click the small lens icon button to pull up the log.
4) On the job application page in the YARN UI, click on logs to get more information.
In my experience, the error messages are very direct on this page and will tell exactly what the problem is.
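If the UI is not convenient, the same YARN logs can usually be fetched with `yarn logs -applicationId <id>`, assuming log aggregation is enabled on the cluster. The sketch below only builds that command from the Ext ID column of the Oozie action table. The helper name `yarn_logs_command` is illustrative, and the job ID in the usage note is made up. It relies on the convention that a MapReduce job ID `job_<ts>_<seq>` corresponds to the YARN application `application_<ts>_<seq>`.

```python
def yarn_logs_command(ext_id):
    """Build the `yarn logs` argument list for an Oozie action's Ext ID.

    Raises ValueError when the Ext ID is not a job or application ID,
    for example the "-" placeholder shown when no ID was recorded.
    """
    if ext_id.startswith("job_"):
        # MapReduce job IDs and YARN application IDs share the same suffix.
        app_id = "application_" + ext_id[len("job_"):]
    elif ext_id.startswith("application_"):
        app_id = ext_id
    else:
        raise ValueError("no YARN application behind Ext ID %r" % ext_id)
    return ["yarn", "logs", "-applicationId", app_id]
```

For a hypothetical Ext ID `job_1445717114443_0007` this yields `yarn logs -applicationId application_1445717114443_0007`. In the trace posted above the Ext ID of mr-node is `-`, which suggests no Hadoop job was recorded for that action, so the Oozie server log may be the more useful place to look in that case.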
Created 11-26-2015 01:18 PM
You can run the smoke test script from the command line to see whether a timeout is causing the service check to fail (one possible cause).
Run the command below as the ambari-qa user:
source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://localhost:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
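To judge whether a timeout is involved, it helps to know how long the command actually takes. A minimal sketch for timing it under a time limit follows. The helper name `run_timed` and the limit you pass are assumptions; the thread does not state the timeout Ambari applies to the service check. It needs Python 3.5 or later for `subprocess.run`.

```python
import subprocess
import time


def run_timed(cmd, timeout_s):
    """Run cmd (a list of arguments) and report exit code, duration and output."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            timeout=timeout_s,
        )
        timed_out = False
        returncode = proc.returncode
        output = proc.stdout.decode(errors="replace")
    except subprocess.TimeoutExpired as exc:
        # The child is killed when the limit is exceeded.
        timed_out = True
        returncode = None
        output = (exc.output or b"").decode(errors="replace")
    return {
        "returncode": returncode,
        "timed_out": timed_out,
        "elapsed_s": time.monotonic() - start,
        "output": output,
    }
```

For the smoke test, `cmd` could be `["bash", "-c", "<the command line above>"]`, run as ambari-qa. If `elapsed_s` comes close to or exceeds the service check timeout, the timeout is a plausible cause; if the command fails quickly with a non-zero `returncode`, the output is the place to look.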