Support Questions


Hbase - Zookeeper connection issue

Contributor

Hi Team,

 

My Spark job is hanging or stuck due to the error below. Can anyone please help?

 

22/12/05 22:29:55 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x5e29988e0x0, quorum=localhost:2181, baseZNode=/hbase
22/12/05 22:29:55 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
22/12/05 22:29:55 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1126)

....

 

22/12/05 22:30:12 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
22/12/05 22:30:12 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
22/12/05 22:30:12 WARN zookeeper.ZKUtil: hconnection-0x5e29988e0x0, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid

1 ACCEPTED SOLUTION

Super Collaborator

Hi @hanumanth, as you can see, the Spark job is trying to reach ZooKeeper on localhost:

22/12/05 22:30:12 WARN zookeeper.ZKUtil: hconnection-0x5e29988e0x0, quorum=localhost:2181

 

We expect to see a quorum of three or more ZooKeeper hosts under quorum=. The fact that it shows only localhost indicates that the node on which the Spark job is running doesn't have an hbase-site.xml to direct the job to the correct hbase.zookeeper.quorum.
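For reference, this is roughly what the relevant hbase-site.xml entries should look like once the client configuration is in place; the hostnames below are placeholders for your actual ZooKeeper quorum nodes:

```xml
<!-- Illustrative hbase-site.xml fragment; replace the hostnames
     with the actual ZooKeeper quorum nodes of your cluster. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```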

 

So make sure an HBase Gateway role is deployed on the node from which you are running the Spark job, and also try passing the HBase client configuration to spark-submit, e.g. "--files /etc/spark/conf/yarn-conf/hbase-site.xml".
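A sketch of what that spark-submit invocation could look like (the jar name, class name, and deploy mode below are placeholders for your own job):

```shell
# Ship the HBase client configuration with the job so the driver and
# executors pick up the real hbase.zookeeper.quorum instead of
# defaulting to localhost:2181. Jar/class names are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /etc/spark/conf/yarn-conf/hbase-site.xml \
  --class com.example.MyHBaseJob \
  my-hbase-job.jar
```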


4 REPLIES

Contributor

Hello @hanumanth , 

 

If the ZooKeeper services look up and running, you may need to compare the Spark job failure timestamp against the ZooKeeper logs on the Leader server. If there is no visible issue on the ZooKeeper side, check whether the HBase client configurations were applied properly in the Spark job configuration.
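To quickly verify that ZooKeeper is actually reachable before digging into the job, you can probe the ensemble with the four-letter-word commands (hostnames below are placeholders; note that on newer ZooKeeper versions `ruok`/`stat` must be whitelisted via `4lw.commands.whitelist`):

```shell
# Probe each ZooKeeper server; a healthy node answers "imok" to ruok.
# Replace zk1.example.com etc. with your actual quorum hosts.
for zk in zk1.example.com zk2.example.com zk3.example.com; do
  echo "== $zk =="
  echo ruok | nc -w 2 "$zk" 2181              # healthy node prints: imok
  echo stat | nc -w 2 "$zk" 2181 | head -3    # shows Mode: leader/follower
done
```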

 

Also, confirm that the HBase service itself is up and functional.

 

If the above does not help, you may want to raise a support ticket for the Spark component.

Contributor

Hi All,

 

As CDH version 5.16.1 is out of support, I am unable to contact support. Please help here.


Community Manager

@hanumanth Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks!


Regards,

Diana Torres,
Community Moderator

