Support Questions


check hive failed when using Ambari installation tool: port is not listening

Contributor

Hi,

I'm trying to install Hive on a remote server using the Ambari installation tool. I'm using Ambari 2.2 and installed HDP 2.3.0 with the following components:

  • Kafka
  • Hive
  • Hadoop
  • Ambari Metrics
  • Tez
  • ZooKeeper
  • HBase

I got the following error from the Hive service check:

stderr: /var/lib/ambari-agent/data/errors-64.txt
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/service_check.py", line 106, in <module>
    HiveServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/service_check.py", line 97, in service_check
    (params.hostname, params.hive_server_port, elapsed_time))
resource_management.core.exceptions.Fail: Connection to Hive server spark01.nor1solutions.com on port 10000 failed after 295 seconds

All ports on the remote server are open. When I ran

netstat -tupln | grep -i listen | grep -i 10000

there was no process listening on that port. I retried the installation and hit the same error again.
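For reference, a minimal port probe that can be run from the Ambari node looks like the sketch below. It assumes bash (the /dev/tcp pseudo-device is a bash feature) and uses the host and port from the error message above:

```shell
#!/bin/bash
# check_port HOST PORT -> prints "open" or "closed".
# Uses bash's built-in /dev/tcp pseudo-device, so no extra tools are needed.
check_port() {
  local host=$1 port=$2
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# HiveServer2's default Thrift port is 10000; host name taken from the thread.
check_port spark01.nor1solutions.com 10000
```

If this prints "closed" from a remote node but "open" locally on spark01, the problem is the network path or firewall rather than HiveServer2 itself.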

Can anyone help on how to fix it?

Thanks,
1 ACCEPTED SOLUTION

Master Mentor

@Jade Liu can you run traceroute to the spark01 server? If it's an option, also disable the firewall and try again. It's hard to tell, but your firewall may be denying that server.


10 REPLIES


Try just: netstat -tupln | grep -i 10000

Any firewall rules?

Contributor

Thanks for your reply!

netstat -tupln | grep -i 10000

returned no result.

To check the firewall settings, I ran iptables -L and got the following result:

Chain INPUT (policy ACCEPT)
target     prot opt source                                destination
ACCEPT     all  --  anywhere                              anywhere
DROP       icmp --  anywhere                              anywhere             icmp timestamp-request
DROP       icmp --  anywhere                              anywhere             icmp timestamp-reply
DROP       icmp --  anywhere                              anywhere             icmp address-mask-request
ACCEPT     icmp --  anywhere                              anywhere             icmp any
ACCEPT     all  --  anywhere                              anywhere             state RELATED,ESTABLISHED
ACCEPT     all  --  anywhere                              anywhere
ACCEPT     tcp  --  10.40.19.4                            anywhere             tcp dpt:5666
ACCEPT     tcp  --  dbbkup02.nor1solutions.com            anywhere             tcp dpt:5666
ACCEPT     tcp  --  32.c2.9bc0.ip4.static.sl-reverse.com  anywhere             tcp dpt:5666
ACCEPT     tcp  --  nagios-dev.nor1sc.net                 anywhere             tcp dpt:5666
ACCEPT     udp  --  10.40.19.4                            anywhere             udp dpt:snmp
ACCEPT     udp  --  dbbkup02.nor1solutions.com            anywhere             udp dpt:snmp
ACCEPT     udp  --  32.c2.9bc0.ip4.static.sl-reverse.com  anywhere             udp dpt:snmp
ACCEPT     tcp  --  10.0.0.0/8                            anywhere             state NEW tcp dpt:ssh
ACCEPT     tcp  --  209.119.28.98                         anywhere             state NEW tcp dpt:ssh
DROP       all  --  anywhere                              anywhere

I'm using CentOS 6.7.
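One reading of that listing: the INPUT chain ends with a blanket DROP all rule, so any traffic not explicitly accepted above it, including port 10000, is silently dropped. A possible fix, sketched below (the rule and save command assume CentOS 6's iptables service; adjust to your own policy), is to insert an ACCEPT for the HiveServer2 port ahead of the final DROP:

```shell
# Insert an ACCEPT for HiveServer2's Thrift port at the top of the INPUT
# chain (-I places it before the trailing "DROP all" rule). Requires root.
iptables -I INPUT -p tcp --dport 10000 -j ACCEPT

# Persist the rule across reboots on CentOS 6.
service iptables save
```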

Master Mentor

@Jade Liu can you run traceroute to the spark01 server? If it's an option, also disable the firewall and try again. It's hard to tell, but your firewall may be denying that server.

Master Mentor

Is the firewall down on both nodes?

Contributor

Thanks for replying! Here is the content of /var/lib/ambari-agent/data/errors-64.txt:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/service_check.py", line 106, in <module>
    HiveServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/service_check.py", line 97, in service_check
    (params.hostname, params.hive_server_port, elapsed_time))
resource_management.core.exceptions.Fail: Connection to Hive server spark01.nor1solutions.com on port 10000 failed after 295 seconds

Contributor

Thanks @Artem Ervits for your help!

I ran traceroute -p 10000 spark01.nor1solutions.com and got the following result:

traceroute to spark01.nor1solutions.com (10.86.36.14), 64 hops max, 52 byte packets

The routes look normal.

Then I disabled the firewall with service iptables stop on the server, and Hive now seems to start successfully. But when I try to run any Hive query, I still get the following error:

Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: Connections could not be acquired from the underlying database! Error Code: 0

H020 Could not establish connecton to spark01.nor1solutions.com:10000: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused

Is there anything else blocking the connection to port 10000?
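For what it's worth, "Connection refused" usually means nothing is bound to the port at all (a firewall drop typically shows up as a timeout instead), so it's worth confirming HiveServer2 really came up. A diagnostic sketch, assuming typical HDP log locations which may differ on your install:

```shell
# Is the HiveServer2 process actually running?
ps -ef | grep -i [h]iveserver2

# Is anything bound to the Thrift port yet? Startup can take a minute or two.
netstat -tupln | grep 10000

# Check the HiveServer2 log for the real startup error (the path is a common
# HDP default; verify the configured hive log directory on your cluster).
tail -n 100 /var/log/hive/hiveserver2.log
```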

Master Mentor

@Jade Liu Try restarting Hive with the firewall down. You need to make sure every node's firewall allows every other node in the cluster.

Contributor

Thank you @Artem Ervits! If I cannot turn off the firewall, is there any other way to fix the connection problem?

Master Mentor

@Jade Liu you need to add the server to the firewall policy. You can either allow all traffic to that server in the firewall policy, or go through each port and add it individually.
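As a sketch of that suggestion (the subnet below is a placeholder; substitute the addresses of your own cluster nodes), allowing all traffic from the other cluster hosts would look like:

```shell
# Allow all traffic from the cluster subnet (placeholder address range),
# inserted ahead of the final "DROP all" rule. Requires root.
iptables -I INPUT -s 10.86.36.0/24 -j ACCEPT

# Persist the rule across reboots on CentOS 6.
service iptables save
```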