Created on 04-26-2015 08:52 PM - edited 09-16-2022 02:27 AM
I was able to set up a CDH 5.3 cluster on Amazon AWS using Cloudera Director. The cluster came up fine with these main services: HDFS, YARN, HBase, ZooKeeper and Oozie. I was able to connect to HBase and create tables, and I was also able to import data into HBase from another cluster.
When I run an HBase process as an Oozie job, the job cannot connect to HBase, and the failure is related to the ZooKeeper quorum. The error reports quorum=BROKEN:2181 instead of the three-node quorum that our cluster has.
Hopefully someone can point me in the right direction on how to fix this.
Thanks
Below are some of the errors:
2015-04-25 08:43:31,144 WARN [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=BROKEN:2181, exception=org.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
2015-04-25 08:43:31,144 INFO [main] org.apache.hadoop.hbase.util.RetryCounter: Sleeping 8000ms before retry #3...
2015-04-25 08:43:39,144 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=BROKEN:2181 sessionTimeout=180000 watcher=hconnection-0x5bef45fa, quorum=BROKEN:2181, baseZNode=/hbase
2015-04-25 08:43:39,145 WARN [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Unable to create ZooKeeper Connection
java.net.UnknownHostException: BROKEN
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
    at java.net.InetAddress.getAllByName(InetAddress.java:1162)
    at java.net.InetAddress.getAllByName(InetAddress.java:1098)
    at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:140)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:390)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:271)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:198)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:160)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:586)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:606)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:490)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:564)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:559)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:559)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:550)
    at org.apache.oozie.action.hadoop.MapReduceMain.submitJob(MapReduceMain.java:111)
    at org.apache.oozie.action.hadoop.MapReduceMain.run(MapReduceMain.java:72)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
    at org.apache.oozie.action.hadoop.MapReduceMain.main(MapReduceMain.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-04-25 08:43:39,146 WARN [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=BROKEN:2181, exception=org.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
2015-04-25 08:43:39,146 ERROR [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2015-04-25 08:43:39,146 WARN [main] org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x5bef45fa, quorum=BROKEN:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:143)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:390)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:271)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:198)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:160)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:586)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:606)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:490)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:564)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:559)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:559)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:550)
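The telling entry in this log is the UnknownHostException: the client is trying to resolve a host literally named BROKEN, so the job is not seeing the cluster's real quorum value. As a rough diagnostic, the following Python sketch pulls the quorum out of a log line like the ones above and reports which hosts do not resolve from the machine it runs on. The function names are made up for illustration, and falling back to port 2181 (ZooKeeper's default client port) when no port is given is an assumption of this sketch.

```python
import re
import socket


def quorum_from_log(line):
    """Return (host, port) pairs from the first 'quorum=...' field in a log line."""
    match = re.search(r"quorum=(\S+)", line)
    if match is None:
        return []
    # The field is followed by ", " in the HBase log, so drop a trailing comma.
    field = match.group(1).rstrip(",")
    pairs = []
    for entry in field.split(","):
        host, _, port = entry.partition(":")
        # Assumption: default to ZooKeeper's standard client port if none is given.
        pairs.append((host, int(port) if port else 2181))
    return pairs


def unresolvable(hosts):
    """Return the hosts that cannot be resolved from this machine."""
    bad = []
    for host in hosts:
        try:
            socket.getaddrinfo(host, None)
        except socket.gaierror:
            bad.append(host)
    return bad
```

Running quorum_from_log on the WARN line above yields a single entry for BROKEN on port 2181; passing the host names to unresolvable on a worker node shows whether the problem is name resolution or the configured value itself.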
Created 05-13-2015 06:05 PM
Have you included your HBase configuration on your path when you start the job?
The HBase configuration is needed for the job to run. There are multiple ways to provide it, and we have a Knowledge Base article available for it if you are a subscription customer. Otherwise, check out the standard HBase documentation: the comments in the examples all mention it.
Wilfred
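One way to check this on a given node is to look at the hbase-site.xml the job would pick up and confirm that hbase.zookeeper.quorum is actually set. The Python sketch below does that using only the standard library. It relies on the usual Hadoop configuration layout (property elements with name and value children); the function names are hypothetical, and the path in the usage note is an assumption that may differ on your nodes.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(site_xml_path, name):
    """Return the value of a property from a Hadoop-style *-site.xml, or None."""
    root = ET.parse(site_xml_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def quorum_problem(site_xml_path):
    """Return a short description of what is wrong, or None if the quorum is set."""
    try:
        quorum = read_hadoop_property(site_xml_path, "hbase.zookeeper.quorum")
    except (OSError, ET.ParseError) as exc:
        return f"cannot read {site_xml_path}: {exc}"
    if not quorum:
        return "hbase.zookeeper.quorum is not set"
    return None
```

For example, calling quorum_problem("/etc/hbase/conf/hbase-site.xml") on each worker node (assuming that is where the client configuration is deployed) returns None when the property is present and a short message otherwise.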
Created 05-13-2015 08:21 PM
I was able to fix the issue. We use Chef to set up the HBase configuration on the worker nodes, but a problem in our Chef settings left the HBase connection settings missing there. After I fixed Chef, the HBase connection was set up fine.
Thanks