Created on 12-24-2016 05:08 PM
SYMPTOM:
hive CLI is in hung state for so long, an impatient user did CRTL+C to exit of it and complain about the hive CLI slowness.
ROOT CAUSE:
user is running hive CLI in Kerberos enabled security, we asked them to enable debug logging at the console using hive --hiveconf hive.root.logger=debug,console and saw the following GSS exception due to ticket expiration.
WARN hive.metastore: Failed to connect to the MetaStore Server... org.apache.thrift.transport.TTransportException: GSS initiate failed at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:426) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1531) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3000) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3019) at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1237) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:484) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
during hive cli startup it tries to connect to metastore but do not have a valid TGT hence failing with the GSS exception.
WORKAROUND:
NA
RESOLUTION:
to fail it fast we can use following properties in hive configuration
hive.metastore.connect.retries- no of times Client will try to connect to Metastore by default it is set to 24. hive.metastore.client.connect.retry.delay- the delay after failure this is by default set to 5s.
