Created on 04-15-2019 08:04 AM - edited 09-16-2022 07:18 AM
Hi! Some time ago we decided to move a few of our UDF packages to the auxlib reloadable directory, to allow updating functions without restarting the HiveServers. Since then, however, we have been seeing random errors like the one in the following example:
2019-04-04 22:33:18,103 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Handler-Pool: Thread-117]: Completed compiling command(queryId=hive_20190404223333_ed9b3085-fc91-42b1-9ca4-5224cd838aec); Time taken: 0.481 seconds
2019-04-04 22:33:18,103 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Handler-Pool: Thread-117]: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2019-04-04 22:33:18,103 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Handler-Pool: Thread-117]: </PERFLOG method=releaseLocks start=1554409998103 end=1554409998103 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2019-04-04 22:33:18,105 INFO org.apache.hive.service.cli.operation.OperationManager: [HiveServer2-Handler-Pool: Thread-117]: Closing operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=38b7d0d3-d925-48b1-bdb7-b7c3334cc7d8]
2019-04-04 22:33:18,109 WARN org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-117]: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException Generate Map Join Task Error: Unable to find class: XXXXX
Serialization trace:
genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
colExprMap (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.JoinOperator)
reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
reduceWork (org.apache.hadoop.hive.ql.plan.MapredWork)
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:187)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:271)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:337)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:439)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:416)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:282)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:501)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:763)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Generate Map Join Task Error: Unable to find class: XXXXX
Serialization trace:
genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
colExprMap (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.JoinOperator)
reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
reduceWork (org.apache.hadoop.hive.ql.plan.MapredWork)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinTaskDispatcher.processCurrentTask(CommonJoinTaskDispatcher.java:516)
at org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:179)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:79)
at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
at org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:273)
at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:225)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10315)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10108)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:223)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:558)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1356)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1343)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:185)
... 15 more
XXXXX denotes the class name here. Have you seen such erratic behaviour before? The error occurs only for functions in one package, although there are four packages in the reloadable directory. The only difference between them is that the malfunctioning one is significantly bigger than the others (it's a shaded fat jar).
There are some online resources suggesting that jar size may be a problem, but they don't describe the same issue as ours.
Is there anything we could do to investigate this case further and get more information about the problem?
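For reference, this is roughly how we probe whether a HiveServer2 instance can resolve a class from the reloadable path (a minimal sketch run in beeline; com.example.udfs.MyUDF stands in for the XXXXX class from the trace, and the actual directory is the one configured via hive.reloadable.aux.jars.path in hive-site.xml):

-- force HiveServer2 to rescan hive.reloadable.aux.jars.path
RELOAD;
-- placeholder class name; if the jar was picked up, this resolves
CREATE TEMPORARY FUNCTION probe_udf AS 'com.example.udfs.MyUDF';
-- smoke-test the function
SELECT probe_udf('test');

If the CREATE step fails, the jar simply isn't visible to that server at that moment; if it succeeds while queries still fail as above, the problem is presumably further down, in plan (de)serialization.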
Created on 05-08-2019 12:20 PM - edited 05-08-2019 12:21 PM
Hi Eric,
The size of the jar in question was around 20 MB, and the /tmp partition on all Hive servers had plenty of free space (gigabytes) at every point we saw the problem. In the end we decided to remove this jar from the 'reloadable' directory, so the case remains unsolved. Maybe someone else will have a neat idea. Thanks for the help, cheers!
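If we ever revisit this, one option might be to stop relying on the reloadable directory for this jar and bind it to the function definition instead (a sketch with placeholder names and an assumed HDFS path; we haven't verified that it avoids the original problem):

-- permanent function whose jar is pinned to the definition (Hive 0.13+)
CREATE FUNCTION mydb.my_udf AS 'com.example.udfs.MyUDF'
  USING JAR 'hdfs:///apps/hive/udfs/udfs-fat-1.0.jar';

Updating the function then means dropping and recreating it against a new jar path, so the swap is at least explicit rather than depending on a directory rescan.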
Created 05-10-2019 01:59 AM
Hi,
this perhaps could serve as an explanation, but in our case we didn't make any changes after the initial deployment of the jars, and the problem still persisted for a week or so. One additional detail: we used soft links instead of the actual files (which were stored in a different folder). I'm wondering whether that could cause a short 'unavailability' of these files to Cloudera.
Created 07-06-2020 09:54 AM
Just wondering if you found a workaround for this? I think this is a known bug in Hive 1.1, but unfortunately upgrading Hive is not an option for us right now.
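In case it helps anyone who lands here with the same trace: the failure comes out of CommonJoinTaskDispatcher while Hive is converting the join to a map join, so one mitigation we may try (untested against this exact bug, and it costs map-join performance) is to disable the automatic conversion for the affected queries:

-- run in the same session, before the failing query
set hive.auto.convert.join=false;  -- skip the CommonJoinResolver map-join conversion path

That forces the common (shuffle) join, which avoids the code path that throws, at the price of slower joins.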