Created 02-11-2016 01:24 AM
A user was running a large query that caused our Hive Server to hang. The Hive CLI would still come up, but a simple 'select 1;' query would never return, and anything we ran via JDBC failed with a transport exception. Below is a stack trace that we were able to capture. We eventually restarted Hive to get back into an operational state. How do we prevent this from happening in the future?
at org.apache.hadoop.hive.ql.ppd.ExprWalkerInfo.isCandidate(ExprWalkerInfo.java:165)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractFinalCandidates(ExprWalkerProcFactory.java:330)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:299)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$FilterPPD.process(OpProcFactory.java:437)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10189)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
- locked <0x00000000c0ce9d60> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1116)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Created 02-11-2016 01:26 AM
You must set up YARN queues to enable true multi-tenancy (if you don't already have queues) so that one job won't block the whole cluster.
Need more data from the stack trace.
Created 02-11-2016 01:35 AM
@Neeraj Sabharwal - notice the locked thread. What else would you be looking for from the stack?
Created 02-11-2016 02:27 AM
@Jesse Passwaters Sorry, I don't have much insight on that, but I do know how to avoid this happening.
Demo: https://community.hortonworks.com/articles/10912/yarn-queues-and-cs-view-queue-mapping.html
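For readers who want a rough idea of what such a queue setup could look like, here is a minimal sketch of Capacity Scheduler properties. The queue names (`etl`, `adhoc`), the percentages, and the user name `jesse` are illustrative assumptions, not values from this cluster; the linked article is the authoritative walkthrough.

```
# Hypothetical layout: three queues under root; capacities at one level must sum to 100.
yarn.scheduler.capacity.root.queues=default,etl,adhoc
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.etl.capacity=30
yarn.scheduler.capacity.root.adhoc.capacity=20

# Cap how far the ad hoc queue may grow beyond its guaranteed share.
yarn.scheduler.capacity.root.adhoc.maximum-capacity=40

# Route a (hypothetical) user's jobs to the ad hoc queue automatically.
yarn.scheduler.capacity.queue-mappings=u:jesse:adhoc
```

Note that queues of this kind limit how much cluster capacity a single job can take; whether they also address the locked compile thread shown in the stack trace above is not established in this thread.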
Created 02-11-2016 02:29 AM
Thanks, @Neeraj Sabharwal! I will definitely look into that.