Support Questions

Find answers, ask questions, and share your expertise

Zookeeper client connecting to localhost:2181 instead of Zookeeper quorum hosts

Contributor

Hi All,

 

I am using the Hue -> Hive editor to submit a query on an external table and a view created on top of an HBase table. I have 3 instances of Hue running in my cluster (one of them acting as a load balancer).

 

1. When I submit a query from Beeline on this external table and view, it works perfectly fine.

2. When I submit the same query from Hue, it doesn't work (a simple query like: select * from hbase_view limit 10).

3. hbase-site.xml and hive-site.xml both have zookeeper.quorum defined correctly (I have 5 ZooKeeper server instances, so there are 5 nodes in the zookeeper.quorum property). The client port is 2181. That also looks fine.
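To show what I'm checking: a rough sketch of reading the client-side quorum out of an hbase-site.xml. The XML is inlined as a stand-in (values and the real file path are illustrative; on a node you'd parse the deployed config). Relevant here: HBase clients fall back to localhost when hbase.zookeeper.quorum is missing from the config they actually load, which matches the localhost:2181 symptom.

```python
# Sketch: parse an hbase-site.xml and print the ZooKeeper quorum a client
# would use. The inline XML below is a stand-in for illustration; on a real
# node you would parse the deployed file, e.g. /etc/hbase/conf/hbase-site.xml
# (path is an assumption about the usual layout).
import xml.etree.ElementTree as ET

SAMPLE = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>ZK1,ZK2,ZK3,ZK4,ZK5</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
"""

def read_props(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style XML."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

props = read_props(SAMPLE)
# HBase clients default to localhost when the quorum property is absent,
# which is exactly the localhost:2181 symptom seen in the logs.
quorum = props.get("hbase.zookeeper.quorum", "localhost")
port = props.get("hbase.zookeeper.property.clientPort", "2181")
print(quorum, port)  # -> ZK1,ZK2,ZK3,ZK4,ZK5 2181
```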

 

But I am getting the below error in the hive-server2.log file when a query is submitted from Hue.

 

Instead of trying to connect to one of the ZooKeeper hosts, it is trying to connect to localhost4/127.0.0.1:2181. This is not the case when the query runs successfully; then it picks one of the ZooKeeper nodes for the client connection.

 

2019-08-13 22:10:23,353 INFO  org.apache.zookeeper.ClientCnxn: [HiveServer2-Background-Pool: Thread-8315-SendThread(localhost4:2181)]: Opening socket connection to server localhost4/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

2019-08-13 22:10:23,353 WARN  org.apache.zookeeper.ClientCnxn: [HiveServer2-Background-Pool: Thread-8315-SendThread(localhost4:2181)]: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused

        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)

        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)

        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

 

 

 

2019-08-13 22:10:27,855 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: [HiveServer2-Background-Pool: Thread-8315]: ZooKeeper getData failed after 4 attempts

2019-08-13 22:10:27,855 WARN  org.apache.hadoop.hbase.zookeeper.ZKUtil: [HiveServer2-Background-Pool: Thread-8315]: hconnection-0x5bd726980x0, quorum=localhost:2181, baseZNode=/hbase Unable to get data of znode /hbase/meta-region-server

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server

        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)

        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)

        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)

 

Eventually the query fails without even generating a MapReduce application/job:

 

2019-08-13 22:10:46,572 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-8315]: Job Submission failed with exception 'org.apache.hadoop.hbase.client.RetriesExhaustedException(Can't get the locations)'

org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations

        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:329)

        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:157)

        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)

 

Why is it trying to connect to localhost:2181 instead of the ZooKeeper hosts? Any solutions for this problem?

 

Regards,

Nanda

 

1 ACCEPTED SOLUTION

Super Guru

Hi @nanda_bigdata,

 

In addition, could you please also check the following:

  • In CM > Hive > Configuration, ensure that the HBase Service property is set to the existing HBase service
  • In CM > HBase > Instances, ensure you have Gateway roles assigned to all the hosts in your cluster. If not, add them.

Restart the necessary services and deploy the client configuration if you make any changes above, and try again.

 

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.


5 REPLIES

Super Guru
Hi Nanda,

Are you using Cloudera Manager? If yes, can you go to CM > Hue > Instances > Hue Server > Processes, and then check the hbase-conf/hbase-site.xml and hive-conf/hive-site.xml files and confirm if ZK configurations are set properly there?
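If it helps, that check can be scripted — a rough sketch below. The CM process path mentioned in the comments is an assumption about the default agent layout, and a tiny sample tree is created inline so the snippet runs anywhere:

```python
# Sketch: report the ZooKeeper quorum found in every hbase-site.xml under a
# directory tree. On a CM-managed node you would point SCAN_ROOT at the
# agent's process directory, e.g. /var/run/cloudera-scm-agent/process
# (assumption: default CM layout). Here a sample tree is created so the
# sketch is runnable anywhere.
import os
import tempfile
import xml.etree.ElementTree as ET

def quorum_of(path):
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name") == "hbase.zookeeper.quorum":
            return prop.findtext("value")
    return None  # property absent -> client falls back to localhost

def scan(scan_root):
    found = {}
    for dirpath, _dirs, files in os.walk(scan_root):
        if "hbase-site.xml" in files:
            full = os.path.join(dirpath, "hbase-site.xml")
            found[full] = quorum_of(full)
    return found

# Build a sample tree standing in for the real process directories.
SCAN_ROOT = tempfile.mkdtemp()
conf_dir = os.path.join(SCAN_ROOT, "1234-hue-HUE_SERVER", "hbase-conf")
os.makedirs(conf_dir)
with open(os.path.join(conf_dir, "hbase-site.xml"), "w") as f:
    f.write("<configuration><property>"
            "<name>hbase.zookeeper.quorum</name>"
            "<value>ZK1,ZK2,ZK3,ZK4,ZK5</value>"
            "</property></configuration>")

for path, quorum in scan(SCAN_ROOT).items():
    print(path, "->", quorum)
```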

Do all Hue instances have the same issue?

Thanks
Eric

Contributor

Hi Eric,

I just checked hbase-site.xml and hive-site.xml from the CM -> Hue -> Instances -> Processes tab as well.

They all look good. Below is an example from hive-site.xml; it is the same in hbase-site.xml as well.

 

This problem is not happening on all 3 Hue web UIs: for some users it works in Hue1 and not in the other two; for some users it works in Hue1 and Hue2, but not in Hue3.

 

I have a load-balanced Hue (the recommended one to use among the 3 Hue instances). There it is not working for 90% of the users, including my ID. Are we hitting the maximum client connections to ZooKeeper (maxClientCnxns=60 in my cluster)? If that were the case, I would expect errors in the ZooKeeper logs saying "too many connections from <IP_address> max is 60" etc., but I don't see any.
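To rule that out, connection counts per client host can be pulled from ZooKeeper's 'cons' four-letter command (obtained on a live node with: echo cons | nc ZK1 2181). A rough sketch — the cons output below is made-up sample text, not my real cluster's:

```python
# Sketch: count ZooKeeper client connections per host from the output of the
# 'cons' four-letter command. A real capture would come from:
#   echo cons | nc ZK1 2181
# The sample below is illustrative only.
from collections import Counter

SAMPLE_CONS = """\
 /10.0.0.11:48742[1](queued=0,recved=52,sent=52)
 /10.0.0.11:48750[1](queued=0,recved=13,sent=13)
 /10.0.0.12:39218[1](queued=0,recved=7,sent=7)
"""

def connections_per_host(cons_output):
    counts = Counter()
    for line in cons_output.splitlines():
        line = line.strip()
        if not line.startswith("/"):
            continue
        host = line[1:].split(":", 1)[0]  # "/ip:port[...]" -> ip
        counts[host] += 1
    return counts

MAX_CLIENT_CNXNS = 60  # matches maxClientCnxns in zoo.cfg
for host, n in connections_per_host(SAMPLE_CONS).items():
    flag = "NEAR LIMIT" if n >= MAX_CLIENT_CNXNS else "ok"
    print(f"{host}: {n} connections ({flag})")
```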

 

The error is the same for all users: unable to submit the MapReduce job ("can't get the locations" of HBase regions/data), with the client trying to connect to ZooKeeper on localhost:2181 instead of the actual ZooKeeper nodes.

 

  <property>
    <name>hive.zookeeper.quorum</name>
    <value>ZK1,ZK2,ZK3,ZK4,ZK5</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>

 

Contributor

This is how my zoo.cfg looks:

 

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/bdp/znode/cdh
dataLogDir=/bdp/znode/cdh
clientPort=2181
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=60000
autopurge.purgeInterval=24
autopurge.snapRetainCount=5
quorum.auth.enableSasl=false
quorum.cnxn.threads.size=20
server.1=ZK1:3181:4181
server.2=ZK2:3181:4181
server.3=ZK3:3181:4181
server.4=ZK4:3181:4181
server.5=ZK5:3181:4181
leaderServes=yes
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true
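As a sanity check, a small sketch that parses a zoo.cfg like the one above and reads back the client-facing settings (the inline text mirrors the relevant lines of my config):

```python
# Sketch: parse a zoo.cfg into a dict and sanity-check a few client-facing
# settings. The inline text mirrors the configuration posted above.
ZOO_CFG = """\
tickTime=2000
clientPort=2181
maxClientCnxns=60
server.1=ZK1:3181:4181
server.2=ZK2:3181:4181
server.3=ZK3:3181:4181
server.4=ZK4:3181:4181
server.5=ZK5:3181:4181
"""

def parse_cfg(text):
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key] = value
    return cfg

cfg = parse_cfg(ZOO_CFG)
ensemble = [k for k in cfg if k.startswith("server.")]
print("clientPort:", cfg["clientPort"])   # port clients must use
print("ensemble size:", len(ensemble))    # should match the quorum size
```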


Contributor

Hi Andre,

 

Your solution is right.

But my situation was a little different.

Below are the checks and fixes I did, with Cloudera Support helping me through the process:

 

1. From the HiveServer2 logs, we found that one of the HiveServer2 instances was not talking to the ZooKeeper quorum (only when querying HBase data).

2. Installed the HBase Gateway role on all the Hue instances and HiveServer2 instances.

3. Restarted the HBase services and deployed the client configuration.

4. Restarted the HiveServer2 instance that had the problem of trying to connect to localhost:2181 as the ZooKeeper quorum.

 

Then I tried to submit the query from Beeline and Hue. All worked as expected this time.
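For anyone hitting this later: one quick way to verify which host the ZooKeeper client actually dials is to pull the target out of the ClientCnxn lines in hive-server2.log. A rough sketch, using the log line quoted earlier in this thread as sample input — after the fix, quorum hostnames should show up here instead of localhost:

```python
# Sketch: extract the ZooKeeper host that HiveServer2's client connects to,
# from ClientCnxn log lines. The sample line is the one quoted earlier in
# this thread; on a healthy instance the extracted hosts should be quorum
# nodes, not localhost.
import re

SAMPLE_LOG = ("2019-08-13 22:10:23,353 INFO  org.apache.zookeeper.ClientCnxn: "
              "[HiveServer2-Background-Pool: Thread-8315-SendThread(localhost4:2181)]: "
              "Opening socket connection to server localhost4/127.0.0.1:2181.")

def zk_targets(log_text):
    # Log format: "Opening socket connection to server <host>/<ip>:<port>"
    return re.findall(r"Opening socket connection to server (\S+)/", log_text)

targets = zk_targets(SAMPLE_LOG)
print(targets)  # -> ['localhost4']
```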