Oozie smoke tests failed in Ambari UI

Expert Contributor
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 138, in <module>
    OozieServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 61, in service_check
    OozieServiceCheckDefault.oozie_smoke_shell_file(smoke_test_file_name, prepare_hdfs_file_name)
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/service_check.py", line 123, in oozie_smoke_shell_file
    logoutput=True
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 260, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/var/lib/ambari-agent/tmp/oozieSmoke2.sh redhat /usr/hdp/current/oozie-client /usr/hdp/current/oozie-client/conf /usr/hdp/current/oozie-client/bin http://hdpmaster3.bigdataprod1.wh.xxxxxcorp.com:11000/oozie /usr/hdp/current/oozie-client/doc /usr/hdp/current/hadoop-client/conf /usr/hdp/current/hadoop-client/bin ambari-qa False' returned 1. source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
Job ID : 0000005-151024200514443-oozie-oozi-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path      : hdfs://WHPROD1NN/user/ambari-qa/examples/apps/map-reduce/workflow.xml
Status        : FAILED
Run           : 0
User          : ambari-qa
Group         : -
Created       : 2015-10-25 03:15 GMT
Started       : 2015-10-25 03:15 GMT
Last Modified : 2015-10-25 03:15 GMT
Ended         : 2015-10-25 03:15 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID                                                                            Status    Ext ID                 Ext Status Err Code  
------------------------------------------------------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@:start:                                  OK        -                      OK         -         
------------------------------------------------------------------------------------------------------------------------------------
0000005-151024200514443-oozie-oozi-W@mr-node                                  FAILED    -                      -          EJ001      

-------------

1 ACCEPTED SOLUTION

Contributor

The error code (EJ001) suggests that the Oozie sharelib may not be set up correctly. Was this cluster upgraded? You can also try the following command to see whether it returns more information or an error message:

/usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://hdpmaster3.bigdataprod1.wh.xxxxcorp.com:11000/oozie -info 0000005-151024200514443-oozie-oozi-W@mr-node  
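Both checks can be sketched as small shell helpers. This is only a sketch under assumptions: the `OOZIE_URL` default and the helper names (`action_id`, `check_sharelib`, `show_action`) are illustrative and not from the original post, and `admin -shareliblist` is assumed to be supported by the installed Oozie client.

```shell
#!/usr/bin/env bash
# Assumed defaults: substitute your own Oozie server URL and client path.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"
OOZIE_BIN="${OOZIE_BIN:-/usr/hdp/current/oozie-client/bin/oozie}"

# Build the ID of one workflow action: <workflow job ID>@<action name>.
action_id() {
  printf '%s@%s\n' "$1" "$2"
}

# Ask the Oozie server which sharelib entries it currently knows about.
# An empty list or an error here would be consistent with a sharelib problem.
check_sharelib() {
  "$OOZIE_BIN" admin -oozie "$OOZIE_URL" -shareliblist
}

# Show the detailed status of a single action, including any error message.
show_action() {
  "$OOZIE_BIN" -Doozie.auth.token.cache=false job -oozie "$OOZIE_URL" \
    -info "$(action_id "$1" "$2")"
}

# Example calls (not run automatically):
#   check_sharelib
#   show_action 0000005-151024200514443-oozie-oozi-W mr-node
```

The functions only wrap the commands, so sourcing the file does nothing until one of them is called on a host that has the Oozie client installed.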


4 REPLIES

Master Mentor

@avoma@hortonwork.com

Which version of Ambari are you running? Is Kerberos enabled on this cluster?

Also, please remove the customer reference from the log files.

avatar

It is not possible to tell the root cause of the issue from the trace you provided. The log shows that the MR step failed but does not say why. You can get the exact details of the failure by following the steps below:

1) Click on the job instance in Oozie

2) On the next page, double click on the step that failed.

3) In the popup, click the small lens icon button to pull up the log.


4) On the job application page in the YARN UI, click on the logs to get more information.


In my experience, the error messages are very direct on this page and will tell exactly what the problem is.
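The same information can usually be reached from the command line. The sketch below is illustrative only: the helper names `failed_actions` and `job_to_app_id` are made up for this example, and the parsing assumes the five-column action table shown in the question (ID, Status, Ext ID, Ext Status, Err Code).

```shell
#!/usr/bin/env bash
# Read the text printed by `oozie job -info <job ID>` on stdin and print
# "action-ID  external-ID  error-code" for every action whose status is
# FAILED, KILLED, or ERROR. Only rows whose first field contains "@" are
# treated as action rows.
failed_actions() {
  awk 'index($1, "@") > 0 && $2 ~ /^(FAILED|KILLED|ERROR)$/ { print $1, $3, $NF }'
}

# Turn a Hadoop job ID (job_...) into the YARN application ID that normally
# shares its numeric suffix (application_...). Assumption: the action's
# Ext ID column holds a job_... value; in the question it is "-", so there
# may be no YARN application to inspect for that run.
job_to_app_id() {
  printf '%s\n' "${1/#job_/application_}"
}

# Example calls (not run automatically; <...> are placeholders):
#   oozie job -oozie <Oozie URL> -info <job ID> | failed_actions
#   oozie job -oozie <Oozie URL> -log <job ID>
#   yarn logs -applicationId "$(job_to_app_id <Ext ID>)"
```

If the Ext ID is empty, as it is for `mr-node` in the question, the Oozie job log is likely the more useful place to look, since the action may have failed before any YARN application was launched.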


Master Guru

You can run the smoke test from the command line to check whether a timeout value is causing the service check to fail (this is one possible cause).

Run the command below as the ambari-qa user:

source /usr/hdp/current/oozie-client/conf/oozie-env.sh ; /usr/hdp/current/oozie-client/bin/oozie -Doozie.auth.token.cache=false job -oozie http://localhost:11000/oozie -config /usr/hdp/current/oozie-client/doc/examples/apps/map-reduce/job.properties -run
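A rough way to run that command as ambari-qa from a root shell and see how long the submission takes is sketched below. The helper names `smoke_cmd` and `run_smoke` and the `OOZIE_URL` default are assumptions for illustration; the sketch also assumes that ambari-qa's login shell is bash, so that `source` works.

```shell
#!/usr/bin/env bash
# Assumed default; replace with your Oozie server URL.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"
CLIENT=/usr/hdp/current/oozie-client

# Print the smoke-test submission command as a single line for the given URL.
smoke_cmd() {
  printf 'source %s/conf/oozie-env.sh ; %s/bin/oozie -Doozie.auth.token.cache=false job -oozie %s -config %s/doc/examples/apps/map-reduce/job.properties -run\n' \
    "$CLIENT" "$CLIENT" "$1" "$CLIENT"
}

# Run the submission as ambari-qa (needs root for su) and report elapsed time.
run_smoke() {
  time su - ambari-qa -c "$(smoke_cmd "$OOZIE_URL")"
}

# Example call (not run automatically):
#   run_smoke
```

Note that `-run` only submits the workflow and prints a job ID. To see how long the workflow itself takes, poll that ID with `oozie job -info` until it leaves the RUNNING state.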