Created 10-31-2016 01:09 PM
Hi Team,
We have set up a 3-node HDP 2.5 cluster on CentOS 6.5. When we try to add the Hive service from Ambari in that cluster, we receive a "connection refused" error (it happens on all 3 nodes). However, we have added the other services without any issues. Please note that while adding the service we select "New MySQL database"; there is no existing MySQL database in the cluster. We need your help to address the issue. The error stack below appeared under "Hive Client Install" while adding the Hive service.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_client.py", line 68, in <module>
    HiveClient().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_client.py", line 35, in install
    self.configure(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_client.py", line 43, in configure
    hive(name='client')
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive.py", line 282, in hive
    mode = 0644,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 123, in action_create
    content = self._get_content()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 160, in _get_content
    return content()
  File "/usr/lib/python2.6/site-packages/resource_management/core/source.py", line 51, in __call__
    return self.get_content()
  File "/usr/lib/python2.6/site-packages/resource_management/core/source.py", line 193, in get_content
    web_file = opener.open(req)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1190, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>
Created 10-31-2016 02:53 PM
It should download DBConnectionVerification.jar from the Ambari server; I just double-checked. I am not sure why it is picking up a different host instead of the Ambari server. Can you please check your /etc/hosts file to see whether there is any conflicting entry?
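One way to look for conflicting entries is sketched below. This is only an illustration: `find_conflicting_hosts` is a hypothetical helper name, and it treats "conflicting" as "the same hostname mapped to more than one IP address", which is an assumption about what to look for. Names such as localhost that legitimately map to both 127.0.0.1 and ::1 will also be reported.

```python
from collections import defaultdict


def find_conflicting_hosts(hosts_text):
    """Return {hostname: sorted IPs} for names mapped to more than one IP.

    hosts_text is the content of a hosts file as a single string.
    """
    name_to_ips = defaultdict(set)
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Hostnames are case-insensitive, so compare in lowercase.
            name_to_ips[name.lower()].add(ip)
    return dict(
        (name, sorted(ips))
        for name, ips in name_to_ips.items()
        if len(ips) > 1
    )
```

To run it against a node, pass in the file content, for example `find_conflicting_hosts(open("/etc/hosts").read())`, and compare the result across all three nodes.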
This problem occurs while installing hive-client on your system.
Can you please make sure that DBConnectionVerification.jar is present on all the Ambari agents?
/usr/lib/ambari-agent/DBConnectionVerification.jar
/var/lib/ambari-agent/tmp/DBConnectionVerification.jar
Typically, on the Ambari server, DBConnectionVerification.jar is located at:
/var/lib/ambari-server/resources/DBConnectionVerification.jar
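A quick way to run this check on each agent is sketched here. `missing_files` is a hypothetical helper, and treating a zero-byte jar as missing is an assumption (a download that failed partway could leave an empty file behind); the two agent paths are the ones listed above.

```python
import os

# Agent-side locations mentioned in this thread.
AGENT_JAR_PATHS = [
    "/usr/lib/ambari-agent/DBConnectionVerification.jar",
    "/var/lib/ambari-agent/tmp/DBConnectionVerification.jar",
]


def missing_files(paths):
    """Return the paths that are not regular files or are empty."""
    return [
        p for p in paths
        if not (os.path.isfile(p) and os.path.getsize(p) > 0)
    ]
```

Calling `missing_files(AGENT_JAR_PATHS)` on each agent host returns an empty list when both copies are in place.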
Created 10-31-2016 01:13 PM
Please note that all the prerequisites (SSH setup, disabling THP, disabling SELinux, disabling iptables) have been completed on all the nodes in the cluster.
Created 10-31-2016 01:16 PM
Hi,
Please check the Ambari server log file, and also verify that the MySQL port (3306) is open on that host.
Created 10-31-2016 01:24 PM
Hi,
The Ambari server log snippet is below, and port 3306 is open.
31 Oct 2016 18:52:09,307 INFO [ambari-client-thread-23] AbstractProviderModule:953 - Current Metrics collector Host : null
31 Oct 2016 18:52:09,308 INFO [ambari-client-thread-23] AbstractProviderModule:958 - New Metrics collector Host : nn.tcsgegdc.com
31 Oct 2016 18:52:10,792 INFO [ambari-heartbeat-processor-0] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=HCAT, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALLED
31 Oct 2016 18:52:13,741 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: HIVE_CLIENT, host: dn2.tcsgegdc.com Action id 83-0 and Task id 384
31 Oct 2016 18:52:13,753 INFO [ambari-heartbeat-processor-0] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=HIVE_CLIENT, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALL_FAILED
31 Oct 2016 18:52:13,796 INFO [ambari-heartbeat-processor-0] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=HIVE_METASTORE, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALLED
31 Oct 2016 18:52:14,593 WARN [ambari-action-scheduler] ActionScheduler:415 - HIVE_CLIENT failed, request 83 will be aborted
31 Oct 2016 18:52:14,593 ERROR [ambari-action-scheduler] ActionScheduler:428 - Operation completely failed, aborting request id: 83
31 Oct 2016 18:52:14,606 INFO [ambari-action-scheduler] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=HIVE_SERVER, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALL_FAILED
31 Oct 2016 18:52:14,614 INFO [ambari-action-scheduler] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=MYSQL_SERVER, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALL_FAILED
31 Oct 2016 18:52:14,623 INFO [ambari-action-scheduler] ServiceComponentHostImpl:1067 - Host role transitioned to a new state, serviceComponentName=WEBHCAT_SERVER, hostName=dn2.tcsgegdc.com, oldState=INSTALLING, currentState=INSTALL_FAILED
31 Oct 2016 18:52:14,632 INFO [ambari-action-scheduler] ActionDBAccessorImpl:218 - Aborting command. Hostname dn2.tcsgegdc.com role HIVE_SERVER requestId null taskId 386 stageId null
31 Oct 2016 18:52:14,632 INFO [ambari-action-scheduler] ActionDBAccessorImpl:218 - Aborting command. Hostname dn2.tcsgegdc.com role MYSQL_SERVER requestId null taskId 387 stageId null
31 Oct 2016 18:52:14,632 INFO [ambari-action-scheduler] ActionDBAccessorImpl:218 - Aborting command. Hostname dn2.tcsgegdc.com role WEBHCAT_SERVER requestId null taskId 388 stageId null
31 Oct 2016 18:52:14,747 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:626 - Security of service component HCAT of service HIVE of cluster TCSGEINTERNALCLUSTER has changed from UNSECURED to UNKNOWN on host dn2.tcsgegdc.com
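For anyone reading a longer server log, a small filter like the following may help pull out which components ended up in INSTALL_FAILED. It is a rough sketch: `failed_installs` is a hypothetical helper, and the regular expression is inferred only from the "Host role transitioned to a new state" entries in the snippet above, so other Ambari versions may format these entries differently.

```python
import re

# Matches the key=value part of the state-transition entries seen above.
TRANSITION = re.compile(
    r"serviceComponentName=(?P<component>\w+), "
    r"hostName=(?P<host>[\w.-]+), "
    r"oldState=(?P<old>\w+), "
    r"currentState=(?P<new>\w+)"
)


def failed_installs(log_text):
    """Return (component, host) pairs whose new state is INSTALL_FAILED."""
    return [
        (m.group("component"), m.group("host"))
        for m in TRANSITION.finditer(log_text)
        if m.group("new") == "INSTALL_FAILED"
    ]
```

In this snippet HIVE_CLIENT is the first component to fail; the other failures are logged after the scheduler aborts request 83.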
Created 10-31-2016 01:32 PM
Adding the snapshot of ambari-alerts.log.
Created 10-31-2016 01:50 PM
On which host have you installed the MySQL server? Can you please double-check that MySQL is up and running? Can you also try to log in to the mysql shell from another node in the cluster and see whether it works?
Created 10-31-2016 01:58 PM
Hi @Kuldeep Kulkarni, we are selecting a new MySQL instance while adding the Hive service, so MySQL is not pre-configured. We have deleted the service from Ambari because the Hive services were having issues.
Created 10-31-2016 02:01 PM
@rajdip chaudhuri - Okay. Can you please share more logs from when you get the error, i.e. stdout and stderr from the Ambari UI?
Created 10-31-2016 02:07 PM
@Kuldeep Kulkarni - Below are the logs from the Ambari UI.
Created 10-31-2016 02:11 PM
@rajdip chaudhuri - Okay, the command below is failing:
Downloading the file from http://dn1.tcsgegdc.com:8080/resources/DBConnectionVerification.jar
Can you please make sure that you are able to connect to the Ambari server from the problematic host where the above command is failing? You can try a simple telnet to check connectivity on port 8080.
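If telnet is not installed on the node, the same check can be approximated with a few lines of Python. This is a minimal sketch: `can_connect` is a hypothetical helper, and it only tests whether a TCP connection can be opened, not whether the jar itself can be downloaded.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True
```

On the failing host, `can_connect("dn1.tcsgegdc.com", 8080)` returning False would be consistent with the "[Errno 111] Connection refused" in the traceback and would suggest that nothing is listening on that host and port, or that the hostname resolves to the wrong machine.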