Created 12-08-2017 05:10 PM
I'm getting a vertex-failed error when I try to run a query using Hive interactive. The actual error is a NoSuchMethodError wrapped in an org.apache.hadoop.ipc.RemoteException, but I'm not sure whether that's the root cause. The query joins 3 large tables together. It works fine if I query just one of the tables, but as soon as I join one of them it fails with the error below. Most of the vertex-failed questions I've found online have to do with memory, but in those cases the error message mentions memory explicitly; mine does not, and I've tried all of the recommendations from those threads with no change in the result. The query with the joins does work if I turn off LLAP, but it takes a really long time, and I'd like to use this feature if possible. Does anyone know what the issue might be? I'm stuck on this one.
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex failed, vertexName=Map 3, vertexId=vertex_1512748246177_0021_2_01, diagnostics=[Task failed, taskId=task_1512748246177_0021_2_01_000001, diagnostics=[TaskAttempt 0 failed, info=[org.apache.hadoop.ipc.RemoteException(java.lang.NoSuchMethodError): org.apache.log4j.MDC.put(Ljava/lang/String;Ljava/lang/String;)V
    at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:214)
    at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:547)
    at org.apache.hadoop.hive.llap.daemon.impl.LlapProtocolServerImpl.submitWork(LlapProtocolServerImpl.java:101)
    at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:16728)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
Created 12-08-2017 08:04 PM
This is likely because a log4j-1.2.x.jar is somewhere on your classpath and is getting bundled into the LLAP package. Hive 2 and LLAP moved to Log4j 2.x, which is not compatible with the old Log4j 1.x. If you have HADOOP_USER_CLASSPATH_FIRST set to true in your environment, the old log4j should not be picked up. It could also be that the Hive libs directory is transitively pulling in the old log4j.
When LLAP bundles a package, only the following Log4j 2.x jars are expected:
log4j-1.2-api-2.8.2.jar (bridge jar)
log4j-api-2.8.2.jar
log4j-core-2.8.2.jar
log4j-jul-2.5.jar
log4j-slf4j-impl-2.8.2.jar
log4j-web-2.8.2.jar
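To check for a legacy jar pulled in transitively via the Hive libs directory, something like the following sketch can help (HIVE_HOME and the fallback path are assumptions for illustration; adjust for your install):

```shell
# Hedged sketch: search the Hive lib directory for a legacy Log4j 1.x jar.
# Note the naming: log4j-1.2-api-*.jar is the Log4j 2.x bridge and is fine;
# only log4j-1.2.<patch>.jar is the legacy artifact, so match the dot form.
find "${HIVE_HOME:-/usr/hdp/current/hive2}/lib" -name 'log4j-1.2.*.jar' 2>/dev/null
```

Any path this prints is a candidate for backing up and removing before retrying.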
Can you check whether your Hadoop classpath contains a log4j-1.2.x.jar by any chance? If so, please back it up somewhere (or remove it) and retry.
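One way to inspect this, as a rough sketch: the `hadoop classpath` command prints the expanded classpath, which can be split and filtered (the pattern below deliberately skips the legitimate log4j-1.2-api bridge jar):

```shell
# Expand the effective Hadoop classpath one entry per line and flag
# legacy Log4j 1.x jars. log4j-1.2-api-*.jar is part of Log4j 2.x and
# is fine, so match only the log4j-1.2.<patch>.jar form.
hadoop classpath | tr ':' '\n' | grep 'log4j-1\.2\.[0-9]'
```

No output (grep exits non-zero) means no legacy jar was found on the expanded entries; note that wildcard entries like `/some/dir/*` would still need to be listed separately.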
Created 12-12-2017 04:46 PM
I do have HADOOP_USER_CLASSPATH_FIRST set to true. How do I find what's on the Hadoop classpath? In the hadoop-env file it's just set as HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}