Created on 12-21-2016 05:43 PM
SYMPTOM:
HiveServer2 hangs and cannot execute even a simple query such as show tables. During the investigation, we took several jstack dumps and found that the following thread was progressing very slowly:
"HiveServer2-Handler-Pool: Thread-86129" #86129 prio=5 os_prio=0 tid=0x00007f3ad9e1a800 nid=0x1003b runnable [0x00007f3a73b0a000]
   java.lang.Thread.State: RUNNABLE
    at java.util.HashMap$TreeNode.find(HashMap.java:1851)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.find(HashMap.java:1861)
    at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
    at java.util.HashMap.putVal(HashMap.java:637)
    at java.util.HashMap.put(HashMap.java:611)
    at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:290)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$JoinerPPD.process(OpProcFactory.java:464)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
    at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10167)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:406)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:290)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
    - locked <0x00000005c1e1bc18> (a java.lang.Object)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
    at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy45.executeStatementAsync(Unknown Source)
    at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE:
Before Hive 2, the compilation stage of a query in HiveServer2 is single-threaded: only one query can compile at a time, and all other queries remain in a wait state until it finishes. The jstack shows that, during compilation, this thread is running the expression walker for predicate pushdown (PPD), in which each processor determines whether an expression is a possible candidate for the predicate pushdown optimization for the given operator. At that time the user was running a very large query containing more than 300 CASE/THEN conditions, so this step was taking too long and every other query, including show tables, was stuck waiting behind it.
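The blocking effect can be illustrated with a minimal, self-contained sketch. This is not Hive code: `CompileLockDemo`, `COMPILE_LOCK`, and the sleep durations are hypothetical stand-ins, assuming only what the jstack shows (a single `java.lang.Object` monitor held in `Driver.compileInternal`). A slow "compile" takes the lock first, and a trivial one cannot finish until the slow one releases it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of a server whose compile phase is guarded by one global lock.
class CompileLockDemo {
    // One monitor shared by every session (assumption modelled on the jstack above).
    private static final Object COMPILE_LOCK = new Object();

    // Records the order in which queries finish compiling.
    private static final List<String> finished = new CopyOnWriteArrayList<>();

    // Simulates compiling a query: holds the global lock for workMillis.
    static void compile(String name, long workMillis, CountDownLatch lockAcquired) {
        synchronized (COMPILE_LOCK) {
            lockAcquired.countDown();          // signal that this query now holds the lock
            try {
                Thread.sleep(workMillis);      // stand-in for planning and optimization work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finished.add(name);
        }
    }

    // Starts a slow query, then a fast one, and returns the completion order.
    static List<String> run() {
        finished.clear();
        CountDownLatch slowHasLock = new CountDownLatch(1);
        Thread slow = new Thread(() -> compile("huge CASE query", 500, slowHasLock));
        Thread fast = new Thread(() -> compile("show tables", 1, new CountDownLatch(1)));
        try {
            slow.start();
            slowHasLock.await();               // ensure the slow query grabbed the lock first
            fast.start();                      // the fast query now blocks on COMPILE_LOCK
            slow.join();
            fast.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(finished);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Even though the second query needs only a millisecond of work, it completes after the slow one, which matches the symptom of show tables appearing hung.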
WORKAROUND:
Ask the customer to set hive.optimize.ppd=false at the session level while running this query, and to rewrite the SQL in a more optimized way.
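Inside an open session the setting is applied by running `set hive.optimize.ppd=false;` before the query. For JDBC clients, one possible alternative is to pass the property in the Hive configuration list of the connection URL (the part after `?`). The sketch below only builds such a URL; the class name `HiveSessionUrl`, the host `hs2.example.com`, and the port are hypothetical, and whether the server allows this property to be overridden per session depends on its configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a HiveServer2 JDBC URL that carries session-level Hive settings.
class HiveSessionUrl {
    // Produces jdbc:hive2://host:port/db?key=value;key=value
    static String build(String host, int port, String db, Map<String, String> hiveConf) {
        String base = "jdbc:hive2://" + host + ":" + port + "/" + db;
        if (hiveConf.isEmpty()) {
            return base;
        }
        StringJoiner conf = new StringJoiner(";");   // Hive conf entries are ';'-separated
        hiveConf.forEach((key, value) -> conf.add(key + "=" + value));
        return base + "?" + conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hive.optimize.ppd", "false");      // disable predicate pushdown for this session only
        // Host and port are placeholders, not values from the original case.
        System.out.println(build("hs2.example.com", 10000, "default", conf));
    }
}
```

Scoping the change to one session keeps predicate pushdown enabled for all other users of the same HiveServer2 instance.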
RESOLUTION:
Set hive.optimize.ppd=false at the session level.