Created 05-31-2017 12:22 AM
I was using Ambari 2.1 to manage HDP 2.3.0 [AWS Community AMI]
I was able to upgrade to Ambari 2.4.0.1 successfully and all agents were also upgraded and started.
When I check the Ambari dashboard, Ambari fails to restart Falcon and Atlas.
I tried restarting a couple of times, but to no avail.
Here are the services I have on my cluster:
HDFS, YARN, MapReduce2, Tez, Hive, Pig, Oozie, Zookeeper, Falcon, Ambari Metrics, Atlas, Slider
Here is the Atlas Metadata Server restart error log:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/ATLAS/0.1.0.2.3/package/scripts/metadata_server.py", line 217, in <module>
    MetadataServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 685, in restart
    self.stop(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/ATLAS/0.1.0.2.3/package/scripts/metadata_server.py", line 113, in stop
    user=params.metadata_user,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'source /etc/atlas/conf/atlas-env.sh; /usr/hdp/current/atlas-server/bin/atlas_stop.py' returned 255.
-bash: /etc/atlas/conf/atlas-env.sh: No such file or directory
Exception: [Errno 17] File exists: '/usr/hdp/2.3.0.0-2557/atlas/conf'
Traceback (most recent call last):
  File "/usr/hdp/current/atlas-server/bin/atlas_stop.py", line 53, in <module>
    returncode = main()
  File "/usr/hdp/current/atlas-server/bin/atlas_stop.py", line 28, in main
    confdir = mc.dirMustExist(mc.confDir(metadata_home))
  File "/usr/hdp/2.3.0.0-2557/atlas/bin/atlas_config.py", line 94, in dirMustExist
    os.mkdir(dirname)
OSError: [Errno 17] File exists: '/usr/hdp/2.3.0.0-2557/atlas/conf'
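One possible reading of the Atlas log, offered as a guess rather than a confirmed diagnosis: in Python, os.path.exists() returns False for a symlink whose target is missing, while os.mkdir() on that same path still fails with Errno 17. That would fit '/usr/hdp/2.3.0.0-2557/atlas/conf' being a dangling symlink, together with the "No such file or directory" for /etc/atlas/conf/atlas-env.sh. The body of dirMustExist is not shown in the log, so this is an assumption. A minimal sketch to check it on the affected host (describe_path is a name made up for this example):

```python
import os


def describe_path(path):
    """Classify a path as missing, a directory, a symlink, or a dangling symlink."""
    if os.path.islink(path):
        target = os.readlink(path)
        # os.path.exists() follows symlinks, so it is False when the target is gone.
        if not os.path.exists(path):
            return "dangling symlink -> " + target
        return "symlink -> " + target
    if not os.path.lexists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    return "other"


if __name__ == "__main__":
    # Both paths are taken from the error log above.
    for p in ("/usr/hdp/2.3.0.0-2557/atlas/conf", "/etc/atlas/conf"):
        print("%s => %s" % (p, describe_path(p)))
```

If the first path is reported as a dangling symlink, the mkdir failure would be explained by the missing link target rather than by Atlas itself.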
Here is the Falcon Server restart error log:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon_server.py", line 177, in <module>
    FalconServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 685, in restart
    self.stop(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon_server.py", line 55, in stop
    falcon('server', action='stop', upgrade_type=upgrade_type)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon.py", line 251, in falcon
    environment=environment_dictionary)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/falcon-server/bin/falcon-stop' returned 1.
Hadoop home is set, adding libraries from '/usr/hdp/current/hadoop-client/bin/hadoop classpath' into falcon classpath
/usr/hdp/current/falcon-server/bin/service-stop.sh: line 37: kill: (3090) - No such process
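The "kill: (3090) - No such process" line suggests, though it does not prove, that the stop script read a PID from a leftover PID file whose process was already gone. A rough sketch for checking whether a recorded PID is still alive; the function names are invented for this example, and the location of Falcon's PID file is not given in the log, so the caller has to supply it:

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists. Signal 0 only probes, it kills nothing."""
    try:
        os.kill(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            # No such process.
            return False
        if exc.errno == errno.EPERM:
            # The process exists but belongs to another user.
            return True
        raise
    return True


def stale_pid_file(pid_file):
    """Return the PID recorded in pid_file if that process is gone, otherwise None."""
    with open(pid_file) as handle:
        pid = int(handle.read().strip())
    if pid_is_running(pid):
        return None
    return pid
```

Calling stale_pid_file() on the Falcon PID file (wherever it lives on this install) would return 3090 if the file is stale and None if the process is actually running.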
Any help/hint is greatly appreciated.
Created 06-01-2017 07:55 PM
Found the answer with the help of this KB Article.