Connection failed: [Errno 111] Connection refused to ip-ec2-private.ap-south-1.compute.internal:16000

Contributor

Hi,

I updated the log4j properties of HBase from the Ambari server in an HDP 2.4 cluster.

Then I restarted the affected services.

After the restart, connections to the HBase master node are refused.

Please suggest how to solve this.

11821-hbase-master-con-refuesed.png

1 ACCEPTED SOLUTION

Super Collaborator
@amit Kumar

"Connection refused" means that nothing is listening on that port. Please verify on the master host that port 16000 is in the LISTEN state:

# netstat -an | grep 16000

Make sure the port is in the LISTEN state on either 0.0.0.0:16000 or <IPofabovehost>:16000.
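
As a rough sketch of how that netstat check could be automated, the Python below parses `netstat -an` text and reports which local addresses are listening on a port. The function names and the sample output are illustrative only (they are not from this thread), and the column layout assumed is the usual Linux one.

```python
def listening_addresses(netstat_output, port):
    """Return the local addresses in LISTEN state on the given TCP port."""
    addresses = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Linux layout: proto recv-q send-q local-addr foreign-addr state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        # Split on the last colon so IPv6 forms such as ":::16000" also work.
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addresses.append(host)
    return addresses


def is_loopback_only(addresses):
    """True if there are listeners and every one is on a loopback address."""
    def loopback(addr):
        return addr == "::1" or addr.startswith(("127.", "::ffff:127."))
    return bool(addresses) and all(loopback(a) for a in addresses)


# Hypothetical sample output, made up for illustration.
sample = """\
tcp        0      0 127.0.0.1:16000         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
"""

found = listening_addresses(sample, 16000)
print(found, is_loopback_only(found))
```

If the port only shows up on a loopback address, other nodes would get "Connection refused" even though the master process is running.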


9 REPLIES

Contributor

Checking the logs of the HBase master node:

vim /var/log/hbase/hbase-hbase-master-ods-node2.log

2017-01-27 10:48:15,699 INFO [localhost:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 472400 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2017-01-27 10:48:17,203 INFO [localhost:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 473904 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2017-01-27 10:48:18,707 INFO [localhost:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 475408 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2017-01-27 10:48:20,212 INFO [localhost:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 476913 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2017-01-27 10:48:21,716 INFO [localhost:16000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 478417 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.

Checking the logs of the region server node:

vim /var/log/hbase/hbase-hbase-master-ods-node3.log

2017-01-27 10:41:22,369 WARN [regionserver/localhost/127.0.0.1:16020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2017-01-27 10:41:25,369 INFO [regionserver/localhost/127.0.0.1:16020] regionserver.HRegionServer: reportForDuty to master=localhost,16000,1485513620999 with port=16020, startcode=1485513635588
2017-01-27 10:41:25,370 WARN [regionserver/localhost/127.0.0.1:16020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2270)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:894)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupConnection(RpcClientImpl.java:410)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:716)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)

Explorer

It appears there is no connectivity between the Region Servers and the Master. Can you try the following?

Ping both ways between the region servers and the master.

Telnet from the region servers to the master on the master's port number.

If either of these checks fails, involve a network admin.
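
If telnet is not installed on the nodes, a plain TCP connect does the same job. This is only a minimal sketch using Python's standard `socket` module; `can_connect` is a made-up helper name and the host in the usage note is a placeholder.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

Run from a region server node, something like `can_connect("<master-host>", 16000)` returning False would match the "Connection refused" seen in the logs.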

Thank you

Joginder


Explorer

Is the RegionServer looping back to the Master on the same localhost?

Contributor

No, they are on different AWS instances.

Explorer

If they are on different nodes, then they are both bound to their own localhost. Your external interface may not be working on each of the nodes.
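
The region server log posted earlier in the thread shows this directly: it reports to `master=localhost,16000,...`. A small sketch for pulling that target out of a log, assuming the `reportForDuty to master=host,port,startcode` wording seen in that excerpt (the function names here are hypothetical):

```python
import re

# Matches the "reportForDuty to master=<host>,<port>,<startcode>" log wording.
REPORT_RE = re.compile(r"reportForDuty to master=([^,\s]+),(\d+),(\d+)")


def master_targets(log_text):
    """Return (host, port) pairs the RegionServer tried to report to."""
    return [(m.group(1), int(m.group(2))) for m in REPORT_RE.finditer(log_text)]


def reports_to_loopback(log_text):
    """True if any reportForDuty attempt targets a loopback name or address."""
    loopback_names = {"localhost", "127.0.0.1", "::1"}
    return any(host in loopback_names for host, _ in master_targets(log_text))


# Line taken from the region server log in this thread.
line = ("2017-01-27 10:41:25,369 INFO [regionserver/localhost/127.0.0.1:16020] "
        "regionserver.HRegionServer: reportForDuty to master=localhost,16000,"
        "1485513620999 with port=16020, startcode=1485513635588")

print(master_targets(line), reports_to_loopback(line))
```

A region server on a separate node that tries to reach the master at localhost is connecting to itself, which would explain the refused connection.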

Explorer

You need to bring the external interface up, as it may be down, and then restart these services so that they bind to the non-localhost interface.
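
Before restarting, it may be worth checking what each node's hostname resolves to. The sketch below assumes (this is not confirmed anywhere in this thread) that the services pick their bind address from hostname resolution, so a hostname that maps only to a loopback address would reproduce the symptom. The helper names are made up for illustration.

```python
import ipaddress
import socket


def is_loopback_address(ip):
    """True for loopback addresses such as 127.0.0.1, 127.0.1.1 or ::1."""
    # Drop any IPv6 zone suffix (for example "%eth0") before parsing.
    return ipaddress.ip_address(ip.split("%")[0]).is_loopback


def hostname_addresses(hostname):
    """Return the sorted, de-duplicated addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def resolves_only_to_loopback(hostname):
    """True if the hostname resolves, and only to loopback addresses."""
    addresses = hostname_addresses(hostname)
    return bool(addresses) and all(is_loopback_address(a) for a in addresses)
```

For example, `resolves_only_to_loopback(socket.getfqdn())` could be run on the master and on each region server node; a True result would point at name resolution rather than at the network interface itself.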

Contributor

Initially the HBase services were working fine on all the client nodes, but this connection problem started when I restarted the HBase service after making changes via the Ambari server. The same problem does not occur in the sandbox (you can start and stop any number of times).

Are there any special instructions to follow before starting or stopping HBase (or any other service) on a multi-node AWS cluster on HDP 2.4.0?

Explorer

After bringing the network interface up, did it work?