Created 12-23-2016 07:02 AM
HiveServer2 goes down in our environment within 5 minutes of being started, with the following error. Any thoughts?
2016-12-22 13:17:31,526 FATAL [main]: server.HiveServer2 (HiveServer2.java:addServerInstanceToZooKeeper(236)) - Unable to create a znode for this server instance
java.lang.Exception: Max znode creation wait time: 120s exhausted
    at org.apache.hive.service.server.HiveServer2.addServerInstanceToZooKeeper(HiveServer2.java:225)
    at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:417)
    at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:78)
    at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:654)
    at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:527)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
2016-12-22 13:17:31,531 INFO [main]: server.HiveServer2 (HiveServer2.java:stop(371)) - Shutting down HiveServer2
2016-12-22 13:17:31,531 INFO [main]: thrift.ThriftCLIService (ThriftCLIService.java:stop(199)) - Thrift server has stopped
2016-12-22 13:17:31,531 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:ThriftBinaryCLIService is stopped.
2016-12-22 13:17:31,532 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2016-12-22 13:17:31,532 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2016-12-22 13:17:41,533 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2016-12-22 13:17:41,534 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
2016-12-22 13:17:41,544 INFO [main]: zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x358a9c9199506a2 closed
2016-12-22 13:17:41,545 INFO [main]: server.HiveServer2 (HiveServer2.java:removeServerInstanceFromZooKeeper(338)) - Server instance removed from ZooKeeper.
2016-12-22 13:17:41,545 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(442)) - Error starting HiveServer2 on attempt 1, will retry in 60 seconds
java.lang.Exception: Max znode creation wait time: 120s exhausted
    at org.apache.hive.service.server.HiveServer2.addServerInstanceToZooKeeper(HiveServer2.java:225)
    at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:417)
    at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:78)
    at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:654)
    at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:527)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
2016-12-22 13:17:41,546 INFO [main-EventThread]: zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2016-12-22 13:18:33,625 INFO [Thread-4]: server.HiveServer2 (HiveStringUtils.java:run(709)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down HiveServer2 at abc.solutions.local/172.16.3.196
************************************************************/
2016-12-22 13:18:33,631 INFO [Thread-7]: server.HiveServer2 (HiveServer2.java:stop(371)) - Shutting down HiveServer2
2016-12-22 13:18:33,632 INFO [Thread-7]: server.HiveServer2 (HiveServer2.java:removeServerInstanceFromZooKeeper(338)) - Server instance removed from ZooKeeper
Created 12-23-2016 07:08 AM
It looks like HiveServer2 is not able to create its znode in ZooKeeper; this could be an issue on the ZooKeeper side.
Could you please check whether the znode was created, using:
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181
ls /hiveserver2
Also check whether there is an issue with the ZooKeeper server itself.
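One quick way to see whether the ZooKeeper server itself is responding is its four-letter command `ruok`, which a healthy server answers with `imok`. The following is only a minimal sketch: the function name `zk_ruok` is made up for illustration, and it assumes the server listens on localhost:2181 and accepts four-letter commands (newer ZooKeeper releases may require them to be whitelisted).

```python
import socket


def zk_ruok(host="localhost", port=2181, timeout=5.0):
    """Send the four-letter command 'ruok'; a healthy server replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Connection refused, timeout, DNS failure, etc.
        return False
    return reply == b"imok"


if __name__ == "__main__":
    print("ZooKeeper answered imok:", zk_ruok())
```

In a quorum, the same check would have to be run against each ZooKeeper host; a positive answer only shows that the process is up, not that the ensemble is consistent.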
Created 12-23-2016 07:18 AM
Hi @Rajkumar Singh, this is what I see in ZK:
[zk: localhost(CONNECTED) 1] ls /hiveserver2
[serverUri=abcd.solutions.local:10000;version=1.2.1000.2.4.0.0-169;sequence=0000000000]
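The child name printed by `ls /hiveserver2` packs the server URI, Hive version and sequence number into one string separated by semicolons. As a rough sketch for reading such output (the helper names `parse_hs2_znode` and `parse_ls_output` are hypothetical, and the format is assumed to match the line shown above), the fields can be split out like this:

```python
def parse_hs2_znode(name):
    """Split one HS2 znode child name into a dict of its key=value fields."""
    fields = {}
    for part in name.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key] = value
    return fields


def parse_ls_output(line):
    """Turn the bracketed list printed by zkCli `ls` into parsed entries."""
    inner = line.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    return [parse_hs2_znode(item) for item in inner.split(",") if item.strip()]


entries = parse_ls_output(
    "[serverUri=abcd.solutions.local:10000;"
    "version=1.2.1000.2.4.0.0-169;sequence=0000000000]"
)
for entry in entries:
    print(entry["serverUri"], entry["sequence"])
```

This makes it easier to compare the registered host and port with the HiveServer2 instance that is actually supposed to be running.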
Created 12-23-2016 07:22 AM
Is this a standalone ZK, or are you running a ZK quorum?
Created 12-23-2016 07:34 AM
Could you please attach the complete HiveServer2 logs? It looks like HS2 is not able to create its znode. This could be a Kerberos issue if you are running in a secure environment, or there might be an inconsistency in the ZooKeeper database.
To rule out ZooKeeper database inconsistencies, you can try the following:
Stop all services that use ZooKeeper.
Change the ZooKeeper directory location under Ambari => ZooKeeper => Configs (by default it is /hadoop/zookeeper).
Start all the services again.
Created 12-23-2016 07:45 AM
@Rajkumar Singh, I stopped the Hive service, removed the znode /hiveserver2 from ZK, and restarted. It has now been running fine for more than 20 minutes. Let me see what happens. Thanks for your help.
Created 12-23-2016 07:11 AM
It looks like there is an issue with ZooKeeper. Can you please check the ZooKeeper logs for errors and share them?
Created 12-23-2016 07:26 AM
it is a quorum of 3 servers
Created 03-17-2017 11:07 AM
Log in to the ZooKeeper command line using zkCli.sh and check the contents of the HiveServer2 znode. In my case it was [ ], i.e. empty.
Delete the HS2 znode using the delete command, then restart HS2.
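The cleanup described above can be written out as the zkCli commands to type. This is only a sketch under assumptions: the helper `cleanup_commands` is hypothetical, it assumes the namespace is /hiveserver2 as in this thread, and it relies on zkCli's `delete` command, which removes a znode only when it has no children (hence the children are listed first). HiveServer2 should be stopped before running the commands.

```python
def cleanup_commands(children, namespace="/hiveserver2"):
    """Return zkCli commands that delete each child znode, then the parent."""
    commands = ["delete {}/{}".format(namespace, child) for child in children]
    commands.append("delete {}".format(namespace))
    return commands


# Example: the single stale registration seen earlier in this thread.
for command in cleanup_commands(
    ["serverUri=abcd.solutions.local:10000;"
     "version=1.2.1000.2.4.0.0-169;sequence=0000000000"]
):
    print(command)
```

With an empty child list, as in the [ ] case, the only command produced is `delete /hiveserver2`.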
Created 05-09-2017 11:20 AM
Yes, it works after deleting the HS2 znode and restarting HS2.