HiveServer2 - Large User Query caused hang

Contributor

A user was running a large query and caused our HiveServer2 instance to hang. The Hive CLI would still come up, but a simple 'select 1;' query would never return, and anything we tried to run via JDBC failed with a transport exception. Below is a stack trace we were able to capture. We eventually restarted Hive to get back to an operational state. How do we prevent this from happening in the future?

        at org.apache.hadoop.hive.ql.ppd.ExprWalkerInfo.isCandidate(ExprWalkerInfo.java:165)
        at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractFinalCandidates(ExprWalkerProcFactory.java:330)
        at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:299)
        at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
        at org.apache.hadoop.hive.ql.ppd.OpProcFactory$FilterPPD.process(OpProcFactory.java:437)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
        at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
        at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10189)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
        - locked <0x00000000c0ce9d60> (a java.lang.Object)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1116)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
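
For context, the "- locked <0x00000000c0ce9d60>" frame inside Driver.compileInternal is HiveServer2's global compile lock: in Hive 1.x only one statement can be compiled at a time, so a query stuck in compilation (here, deep in predicate pushdown in the optimizer) also stalls every other session's statements, which matches the 'select 1;' never returning. As a hedged sketch only - property names and availability depend on your Hive release, so verify them against your distribution's documentation - newer Hive versions expose settings that relax or bound this lock, for example in hive-site.xml:

    <!-- hive-site.xml sketch; assumes a Hive release where these properties exist -->
    <property>
      <name>hive.driver.parallel.compilation</name>
      <value>true</value>
      <description>Let HiveServer2 compile statements from different sessions in
        parallel instead of serializing them behind a single compile lock.</description>
    </property>
    <property>
      <name>hive.server2.compile.lock.timeout.sec</name>
      <value>120</value>
      <description>Stop waiting for the compile lock after 120 seconds instead of
        blocking indefinitely (0 means wait forever).</description>
    </property>

Neither setting shortens the problem query's own compilation; they only keep one slow compile from freezing every other session.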


1 ACCEPTED SOLUTION

Master Mentor
@Jesse Passwaters

If you don't already have queues configured, you must set up YARN queues to enable true multi-tenancy, so that one job can't block the whole cluster.

We'd need more data from the stack (a full thread dump) to dig in further.
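
As an illustrative sketch only (the queue names, capacities, and the assumption of the Capacity Scheduler are all things to adapt to your cluster), a layout that keeps large ad-hoc queries from starving everything else could look like this in capacity-scheduler.xml:

    <!-- capacity-scheduler.xml sketch; queue names and percentages are illustrative -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default,adhoc</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>70</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
      <value>30</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.maximum-capacity</name>
      <value>50</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.user-limit-factor</name>
      <value>1</value>
    </property>

Hive sessions can then be pointed at the smaller queue (for example via tez.queue.name or mapreduce.job.queuename, depending on the execution engine), so a single heavy query competes only for that queue's share of the cluster.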


4 REPLIES

Master Mentor
@Jesse Passwaters

If you don't already have queues configured, you must set up YARN queues to enable true multi-tenancy, so that one job can't block the whole cluster.

We'd need more data from the stack (a full thread dump) to dig in further.

Contributor

@Neeraj Sabharwal - notice the locked thread. What else would you be looking for from the stack?

Master Mentor

Contributor

Thanks, @Neeraj Sabharwal! I will definitely look into that.