10-21-2013 10:56 AM
Seeing this error on an 8-node cluster. We are able to connect from the node where ZooKeeper is deployed.
2013-10-18 17:01:08,656 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: getMaster attempt 0 of 10 failed; retrying after sleep of 1009
org.apache.hadoop.hbase.MasterNotRunningException
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:706)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:126)
    at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.<init>(ThriftServerRunner.java:513)
    at org.apache.hadoop.hbase.thrift.ThriftServerRunner.<init>(ThriftServerRunner.java:228)
    at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:100)
    at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:237)
10-21-2013 12:26 PM
Can you give a bit more detail as to what you are doing when you encounter this error? And is the machine where you are seeing this one of those 8 nodes in the cluster? Or an external machine?
I've seen this before when a client app outside the cluster was unable to connect to the ZooKeeper quorum. A local copy of the hbase-site.xml file was not in the application's path, so it did not know which hosts were the ZooKeeper servers, and the error looked like yours. The property that needs to be specified for the client is hbase.zookeeper.quorum.
10-21-2013 01:37 PM
It is one of the 8 nodes. We see the same issue when running Hive and Impala queries from Hue on a node where ZooKeeper is not running.
ZooKeeper runs on nodes 3, 4, and 5. Hue runs on node 1. We see this issue when executing a Hive query in Hue.
10-21-2013 01:54 PM
Do you have the hbase.zookeeper.quorum property I mentioned previously in the /etc/hbase/conf/hbase-site.xml file on these systems? It sounds like your HBase clients (any app trying to access the HBase service) are falling back to the default value for the ZK quorum, which has them looking for a ZK server on localhost. That would explain why it works on nodes that are running a ZK instance. You need a valid hbase-site.xml file on each node that specifies the ZK quorum. It was described in that link I posted. I hope that helps.
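If it helps with checking each node, here is a rough sketch of a script that reports which ZK quorum a given hbase-site.xml would hand to a client. It falls back to localhost when the property is missing, which mirrors the HBase default. The sample XML and the example.com host names are placeholders I made up for illustration; they are not your actual config.

```python
import xml.etree.ElementTree as ET

# HBase's built-in default when hbase.zookeeper.quorum is not set.
DEFAULT_QUORUM = "localhost"


def zookeeper_quorum(site_xml_text):
    """Return the list of ZooKeeper hosts named in an hbase-site.xml string."""
    root = ET.fromstring(site_xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "hbase.zookeeper.quorum":
            value = prop.findtext("value", "")
            hosts = [h.strip() for h in value.split(",") if h.strip()]
            # An empty value behaves like a missing property in this sketch.
            return hosts or [DEFAULT_QUORUM]
    return [DEFAULT_QUORUM]


# Hypothetical file contents; the host names are placeholders.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node3.example.com,node4.example.com,node5.example.com</value>
  </property>
</configuration>
"""

# A config with no quorum property at all.
EMPTY = "<configuration></configuration>"

if __name__ == "__main__":
    print(zookeeper_quorum(SAMPLE))
    print(zookeeper_quorum(EMPTY))
```

To check a real node, read /etc/hbase/conf/hbase-site.xml into a string and pass it to zookeeper_quorum. Keep in mind this only tells you what that file says; whether a particular client actually picks up that file depends on its classpath.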
10-21-2013 02:28 PM
Thanks for your response. On each node we have the following in hbase-site.xml. I changed the domain to zzz here.
10-21-2013 02:36 PM
I need one clarification. I see this configuration in /etc/hbase/conf/hbase-site.xml
I added two more ZooKeeper servers through the Cloudera Manager services page. When I view hbase -> instances -> region servers -> process -> show/hide configs/hbase-site.xml, I see the 5 ZooKeeper hosts, but /etc/hbase/conf still has 3. Does Cloudera use /etc/hbase/conf/hbase-site.xml by default, or does it store the updated hbase-site.xml somewhere else?
10-22-2013 09:32 AM
We got help from support to solve this. Here is the solution. It is temporary and applies only to the current Hive session.
hive> set hbase.zookeeper.quorum=<comma-separated list of host names>;
hive> ADD JAR /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.4.0.jar;
hive> ADD JAR /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/guava-11.0.2.jar;
hive> ADD JAR /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/zookeeper.jar;
hive> ADD JAR /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hbase/hbase-0.94.6-cdh4.4.0-security.jar;