Created 05-13-2016 03:07 PM
Hi all,
When I try to deploy the packages using Ambari, it fails at the last step. The installation starts and, within a few minutes, reports "failures encountered".
Please note: the Ambari configuration wizard itself reported no errors. It went well.
Ambari-Server.log File :
at java.lang.Thread.run(Thread.java:722)
13 May 2016 14:53:41,948 WARN [qtp-ambari-agent-117] HeartBeatHandler:224 - Old responseId received - response was lost - returning cached response
13 May 2016 14:53:48,566 WARN [qtp-ambari-agent-117] HeartBeatHandler:224 - Old responseId received - response was lost - returning cached response
13 May 2016 14:53:51,993 WARN [alert-event-bus-1] AlertReceivedListener:259 - Cluster lookup failed for cluster named infotitans
13 May 2016 14:53:51,994 WARN [alert-event-bus-1] AlertReceivedListener:138 - Received an alert for ambari_agent_disk_usage which is a definition that does not exist anymore
13 May 2016 14:55:00,565 WARN [qtp-ambari-agent-124] HeartBeatHandler:629 - Operation failed - may be retried. Service component host: DATANODE, host: node1.cluster.net Action id93-0
13 May 2016 14:55:00,695 INFO [qtp-ambari-agent-124] ServiceComponentHostImpl:1041 - Host role transitioned to a new state, serviceComponentName=DATANODE, hostName=node1.cluster.net, oldState=INSTALLING, currentState=INSTALL_FAILED
13 May 2016 14:55:09,330 INFO [qtp-ambari-agent-125] ServiceComponentHostImpl:1041 - Host role transitioned to a new state, serviceComponentName=APP_TIMELINE_SERVER, hostName=node2.cluster.net, oldState=INSTALLING, currentState=INSTALLED
13 May 2016 14:55:28,676 INFO [qtp-ambari-agent-125] ServiceComponentHostImpl:1041 - Host role transitioned to a new state, serviceComponentName=FLUME_HANDLER, hostName=node1.cluster.net, oldState=INSTALLING, currentState=INSTALLED
13 May 2016 14:55:28,710 INFO [qtp-ambari-agent-125] HeartBeatHandler:696 - Security of service component DATANODE of service HDFS of cluster bigdatapoc has changed from UNSECURED to UNKNOWN on host node1.cluster.net
13 May 2016 14:55:40,954 WARN [qtp-ambari-agent-125] HeartBeatHandler:629 - Operation failed - may be retried. Service component host: DATANODE, host: node2.cluster.net Action id93-0
13 May 2016 14:55:40,972 INFO [qtp-ambari-agent-125] ServiceComponentHostImpl:1041 - Host role transitioned to a new state, serviceComponentName=DATANODE, hostName=node2.cluster.net, oldState=INSTALLING, currentState=INSTALL_FAILED
13 May 2016 14:55:41,381 WARN [ambari-action-scheduler] ActionScheduler:304 - DATANODE failed, request 93 will be aborted
13 May 2016 14:55:41,381 WARN [ambari-action-scheduler] ActionScheduler:317 - Operation completely failed, aborting request id: 93
13 May 2016 14:55:41,382 ERROR [ambari-action-scheduler] ServiceComponentHostImpl:1024 - Can't handle ServiceComponentHostEvent event at current state, serviceComponentName=FLUME_HANDLER, hostName=node1.cluster.net, currentState=INSTALLED, eventType=HOST_SVCCOMP_OP_FAILED, event=EventType: HOST_SVCCOMP_OP_FAILED
13 May 2016
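For anyone reading a long ambari-server.log like the one above, a small script can pull out which components actually ended in INSTALL_FAILED. This is only a sketch: the regular expression is inferred from the "Host role transitioned" lines quoted in this post, not from any documented Ambari log format, and `failed_installs` is a made-up helper name.

```python
import re

# Pattern inferred from the "Host role transitioned to a new state" lines
# quoted in this thread (an assumption, not a documented log format).
TRANSITION = re.compile(
    r"serviceComponentName=(?P<component>\w+), "
    r"hostName=(?P<host>[\w.-]+), "
    r"oldState=(?P<old>\w+), "
    r"currentState=(?P<new>\w+)"
)


def failed_installs(log_text):
    """Return (component, host) pairs whose state became INSTALL_FAILED."""
    return [
        (m.group("component"), m.group("host"))
        for m in TRANSITION.finditer(log_text)
        if m.group("new") == "INSTALL_FAILED"
    ]


# Two entries copied from the log in this post.
sample = (
    "13 May 2016 14:55:00,695 INFO [qtp-ambari-agent-124] "
    "ServiceComponentHostImpl:1041 - Host role transitioned to a new state, "
    "serviceComponentName=DATANODE, hostName=node1.cluster.net, "
    "oldState=INSTALLING, currentState=INSTALL_FAILED "
    "13 May 2016 14:55:09,330 INFO [qtp-ambari-agent-125] "
    "ServiceComponentHostImpl:1041 - Host role transitioned to a new state, "
    "serviceComponentName=APP_TIMELINE_SERVER, hostName=node2.cluster.net, "
    "oldState=INSTALLING, currentState=INSTALLED"
)

for component, host in failed_installs(sample):
    print(f"{component} failed to install on {host}")
```

On the full log above this would point at DATANODE on both nodes, which is where the agent-side stderr is worth reading next.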
Ambari UI error logs
stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 49, in install
    self.install_packages(env, params.exclude_packages)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 404, in install_packages
    Package(name)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 49, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
    shell.checked_call(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install 'hadoop_2_4_*'' returned 1. Error Downloading Packages:
  fuse-2.8.3-4.0.2.el6.x86_64: failure: getPackage/fuse-2.8.3-4.0.2.el6.x86_64.rpm from ol6_latest: [Errno 256] No more mirrors to try.
  fuse-libs-2.8.3-4.0.2.el6.x86_64: failure: getPackage/fuse-libs-2.8.3-4.0.2.el6.x86_64.rpm from ol6_latest: [Errno 256] No more mirrors to try.
stdout:
2016-05-13 14:53:52,734 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2016-05-13 14:53:52,735 - Group['hadoop'] {}
2016-05-13 14:53:52,736 - Group['users'] {}
2016-05-13 14:53:52,736 - Group['spark'] {}
2016-05-13 14:53:52,736 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': Tr
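The useful part of that stderr is the last few lines, after the traceback. A rough sketch of extracting the failing packages and the repository they came from is below. The regular expression is based only on the two "failure: getPackage/..." messages quoted in this thread, so treat it as an assumption about yum's wording; `download_failures` is a hypothetical helper name.

```python
import re

# Pattern inferred from the yum messages quoted in this thread; it tolerates
# the variant without spaces ("ol6_latest:[Errno256]") as well.
FAILURE = re.compile(
    r"(?P<package>\S+): failure: getPackage/(?P<rpm>\S+\.rpm) "
    r"from (?P<repo>[^:\s]+):\s*\[Errno\s*(?P<errno>\d+)\]"
)


def download_failures(stderr_text):
    """Return (package, repo, errno) tuples for each failed download."""
    return [
        (m.group("package"), m.group("repo"), int(m.group("errno")))
        for m in FAILURE.finditer(stderr_text)
    ]


# Error text copied from the stderr in this post.
sample = (
    "Error Downloading Packages: fuse-2.8.3-4.0.2.el6.x86_64: failure: "
    "getPackage/fuse-2.8.3-4.0.2.el6.x86_64.rpm from ol6_latest: [Errno 256] "
    "No more mirrors to try. fuse-libs-2.8.3-4.0.2.el6.x86_64: failure: "
    "getPackage/fuse-libs-2.8.3-4.0.2.el6.x86_64.rpm from ol6_latest: "
    "[Errno 256] No more mirrors to try."
)

for package, repo, errno in download_failures(sample):
    print(f"{package} could not be downloaded from {repo} (Errno {errno})")
```

Both failures here come from the ol6_latest repository rather than from the HDP repositories, which suggests the problem is with OS-level package access and not with Ambari itself.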
Created 05-14-2016 12:51 PM
Take a look at this. I recommend using the local repository option. Also, make sure that the repo file is correct and that it does not conflict with repo files on the other nodes, in case there was a previous install in the cluster.
fuse-libs-2.8.3-4.0.2.el6.x86_64: failure: getPackage/fuse-libs-2.8.3-4.0.2.el6.x86_64.rpm from ol6_latest: [Errno 256] No more mirrors to try.
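One way to check for conflicting repo files is to compare the enabled repositories on each node. The sketch below assumes the usual INI-style layout of yum .repo files and reads them with Python's standard configparser. The repo id ol6_latest comes from the error above, but the URLs, the local_media section, and both helper names are made up for illustration.

```python
import configparser


def enabled_repos(repo_file_text):
    """Return {repo_id: baseurl or mirrorlist} for the enabled repositories."""
    # interpolation=None keeps yum variables such as $basearch untouched.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(repo_file_text)
    repos = {}
    for repo_id in parser.sections():
        section = parser[repo_id]
        # yum treats a repository as enabled unless enabled=0 is set.
        if section.get("enabled", "1").strip() == "1":
            repos[repo_id] = section.get("baseurl") or section.get("mirrorlist")
    return repos


def repo_differences(node_a, node_b):
    """Return repo ids that are enabled differently or point elsewhere."""
    return sorted(
        repo_id
        for repo_id in node_a.keys() | node_b.keys()
        if node_a.get(repo_id) != node_b.get(repo_id)
    )


# Hypothetical repo file contents for two nodes (URLs are placeholders).
node1_repo = """
[ol6_latest]
name=Example latest repo (hypothetical)
baseurl=http://repo.example.invalid/ol6/latest/$basearch/
enabled=1

[local_media]
name=Example local media repo (hypothetical)
baseurl=file:///mnt/media
enabled=0
"""

node2_repo = node1_repo.replace("enabled=0", "enabled=1")

print(repo_differences(enabled_repos(node1_repo), enabled_repos(node2_repo)))
```

In practice the text would come from the files under /etc/yum.repos.d/ on each host. Any repo id reported here is a candidate for the kind of conflict described above.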
Created 05-13-2016 03:28 PM
Is this a reinstall (or upgrade)? If so, please uninstall the agent, clear the /var/lib/ambari-agent directory, and reinstall the agent. It looks to me like there are old files in the cache.
Created 05-14-2016 02:12 AM
Hi Ravi, this is a fresh installation. It threw this error on the first attempt and then repeated the same error on every retry. However, I will try your suggestion and post the results. Thanks for the feedback.
Created 05-17-2016 10:02 AM
Thanks @neeraj, I had overlooked the yum errors in the logs. I created a local repository from the Oracle 6.4 media, but that resulted in a lot of errors, because most of the packages already on the system had been updated from the latest Oracle Public Repository.
So I manually installed fuse and all the other RPMs demanded by the deployment wizard, and then started the deployment. It went well. The cluster is now up and running.
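For reference, the package names to install by hand can be derived from the name-version-release.arch strings in the yum error. The sketch below only prints a candidate command and does not run anything. The splitting rule is the conventional RPM naming layout, and `split_nvra` is a hypothetical helper, not something taken from this thread.

```python
def split_nvra(nvra):
    """Split 'name-version-release.arch' into its four parts."""
    rest, arch = nvra.rsplit(".", 1)
    # The name itself may contain dashes (e.g. fuse-libs), so split from the right.
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


# Package strings copied from the yum error earlier in this thread.
failed = [
    "fuse-2.8.3-4.0.2.el6.x86_64",
    "fuse-libs-2.8.3-4.0.2.el6.x86_64",
]

names = [split_nvra(pkg)[0] for pkg in failed]
# Printed for review only; run it manually on each node if it looks right.
print("yum -y install " + " ".join(names))
```

Installing by bare name lets yum pick whichever version the enabled repositories can actually serve, which sidesteps the version mismatch seen with the 6.4 media.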