Created on 10-25-2015 03:32 AM - edited 09-16-2022 02:45 AM
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 138, in <module>
    OozieServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 61, in service_check
    OozieServiceCheckDefault.oozie_smoke_shell_file(smoke_test_file_name, prepare_hdfs_file_name)
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 123, in oozie_smoke_shell_file
    logoutput=True
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 260, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/var/lib/ambari-agent/tmp/oozieSmoke2.sh redhat /usr/hdp/current/oozie-client /usr/hdp/current/oozie-client/conf /usr/hdp/current/oozie-client/bin http://hdpmaster3.bigdataprod1.wh.xxxxxcorp.com:11000/oozie /usr/hdp/current/oozie-client/doc /usr/hdp/current/hadoop-client/conf /usr/hdp/current/hadoop-client/bin ambari-qa False' returned 1.
source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
Job ID : 0000005-151024200514443-oozie-oozi-W
------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path      : hdfs://WHPROD1NN/user/ambari-qa/examples/apps/map-reduce/workflow.xml
Status        : FAILED
Run           : 0
User          : ambari-qa
Group         : -
Created       : 2015-10-25 03:15 GMT
Started       : 2015-10-25 03:15 GMT
Last Modified : 2015-10-25 03:15 GMT
Ended         : 2015-10-25 03:15 GMT
CoordAction ID: -
Actions
------------------------------------------------------------------------------------
ID                                              Status  Ext ID  Ext Status  Err Code
------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@:start:    OK      -       OK          -
------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@mr-node    FAILED  -       -           EJ001
-------------
Created 10-28-2015 04:21 AM
The error code suggests that the Oozie sharelib may not be set up correctly. Was this cluster upgraded? You can also try the following command to see whether it returns more information or a more detailed error message:
/usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -info 0000005-151024200514443-oozie-oozi-W@mr-node
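If a workflow has many actions, it can help to pull just the failed ones out of the job listing before running -info on each. A minimal sketch, assuming the listing has the whitespace-separated columns shown in the question (ID, Status, Ext ID, Ext Status, Err Code); the function name failed_actions and the OOZIE_URL variable are made up for illustration and are not part of the Oozie client:

```shell
#!/bin/sh
# Hypothetical helper (not part of Oozie): read `oozie job -info <workflow-id>`
# output on stdin and print "<action-id> <err-code>" for each failed action.
# Assumes action rows look like the ones posted in the question:
#   <workflow-id>@<action-name>  <status>  <ext id>  <ext status>  <err code>
failed_actions() {
    awk '$1 ~ /@/ && ($2 == "FAILED" || $2 == "ERROR" || $2 == "KILLED") { print $1, $NF }'
}

# Example usage (commented out because it needs a reachable Oozie server;
# OOZIE_URL is an assumed variable holding the server URL):
#   oozie job -oozie "$OOZIE_URL" -info 0000005-151024200514443-oozie-oozi-W | failed_actions
```

Each action ID it prints can then be passed to the -info command above.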
Created 10-25-2015 11:21 AM
@avoma@hortonwork.com
Which version of Ambari are you running? Is Kerberos enabled on this cluster?
Also, please remove the customer reference from the log files.
Created on 10-26-2015 03:46 AM - edited 08-19-2019 05:55 AM
The trace you provided is not enough to determine the root cause. The log shows that the MR step failed but does not say why. You can get the exact details of the failure with the steps below:
1) Click on the job instance in Oozie.
2) On the next page, double-click on the step that failed.
3) In the popup, click on the small lens icon button to pull up the log.
4) On the job application page in the YARN UI, click on logs to get more information.
In my experience, the error messages on this page are very direct and tell you exactly what the problem is.
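The same logs can usually be fetched from the command line with yarn logs, provided log aggregation is enabled. A rough sketch, assuming the action's Ext ID column holds a Hadoop job ID of the form job_<timestamp>_<sequence>; the helper name to_application_id and the ID in the usage comment are invented for illustration. Note that the action in the question shows "-" as its Ext ID, so in that case there may be no YARN application to look at.

```shell
#!/bin/sh
# Hypothetical helper: turn an Oozie action's Ext ID (a Hadoop job ID such as
# job_<timestamp>_<sequence>) into the matching YARN application ID.
# Returns non-zero when the value is not a job or application ID (e.g. "-").
to_application_id() {
    case "$1" in
        job_*)         printf 'application_%s\n' "${1#job_}" ;;
        application_*) printf '%s\n' "$1" ;;
        *)             return 1 ;;
    esac
}

# Example usage (commented out because it needs a YARN client; the ID is made up):
#   app_id=$(to_application_id job_1445731514443_0007) && yarn logs -applicationId "$app_id"
```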
Created 11-26-2015 01:18 PM
You can run the smoke test from the command line to check whether a timeout is causing the service check to fail (this is one possible cause).
Run the command below as the ambari-qa user:
source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://localhost:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
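To judge whether a timeout is involved, it helps to know how long the command actually takes. A small sketch of a wrapper that reports elapsed seconds and the exit status; timed_run is a name made up here, and the timeout to compare against depends on your Ambari setup, which this thread does not state:

```shell
#!/bin/sh
# Hypothetical wrapper: run any command, then report how many seconds it took
# and its exit status on stderr. The command's own exit status is preserved.
timed_run() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start)) exit_status=$status" >&2
    return "$status"
}

# Example usage as the ambari-qa user (commented out because it needs the Oozie client):
#   timed_run sh -c '. /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://localhost:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run'
```

If the elapsed time is well under the service check timeout and the job still fails, the timeout is probably not the cause.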