Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2482 | 02-12-2020 03:17 PM |
| | 1738 | 08-10-2017 09:42 AM |
| | 11546 | 07-28-2017 03:57 AM |
| | 2868 | 07-19-2017 02:43 AM |
| | 2116 | 07-13-2017 11:42 AM |
12-23-2016
05:49 AM
SYMPTOM: A user reports seeing special characters (^@^@^@^@^@^@^) in the HiveServer2 GC logs, similar to the following: concurrent-mark-sweep perm gen total 29888K, used 29792K [0x00000007e0000000, 0x00000007e1d30000, 0x0000000800000000)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ... (long run of NUL bytes omitted) ... ^@^@^@^@^@^@^@^@2016-12-10T00:09:31.648-0500: 912.497: [GC2016-12-10T00:09:31.648-0500: 912.497: [ParNew: 574321K->15632K(629120K), 0.0088960 secs] 602402K->43713K(2027264K), 0.0090030 secs]
ROOT CAUSE: A query running on HiveServer2 in MR mode triggered a map join, which forks a new local-task process that inherits all of HiveServer2's Java arguments, including the GC log path. The child process therefore wrote to the same GC log file as HiveServer2, which introduced the special (NUL) characters and also caused some GC events to be skipped while writing to the file. WORKAROUND: NA RESOLUTION: Run the query in Tez mode, which forces the map join to run inside a task container, or set hive.exec.submit.local.task.via.child=false so that no child process is forked for the local map task. The latter can be risky: if the map join goes OOM, it can stall the HiveServer2 service.
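A minimal sketch of applying the resolution from the command line; the JDBC URL, host name, and query file below are placeholders for illustration, not values from this case:
# Preferred: run the query with Tez so the map join executes inside a task container
beeline -u "jdbc:hive2://<hs2-host>:10000/default" \
  --hiveconf hive.execution.engine=tez \
  -f the_query.hql
# Alternative: stay on MR but do not fork a local child task for the map join
# (risky: if the map join goes OOM, it can stall HiveServer2 itself)
beeline -u "jdbc:hive2://<hs2-host>:10000/default" \
  --hiveconf hive.exec.submit.local.task.via.child=false \
  -f the_query.hql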
12-22-2016
04:42 PM
1 Kudo
SYMPTOM: The following JmxTool command fails with a "port already in use" error: /usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec --jmx-url service:jmx:rmi:///jndi/rmi://`hostname`:1099/jmxrmi ROOT CAUSE: JmxTool is invoked through kafka-run-class.sh, which reads kafka-env.sh. Since kafka-env.sh already contains export JMX_PORT=1099 (the port used by the Kafka broker), the tool's JVM tries to bind the same port on the same host, which results in the error. WORKAROUND: NA RESOLUTION: Edit kafka-run-class.sh and replace the section
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
fi
with the following, so that the broker and the Kafka CLI tools use separate JMX port variables:
# JMX port to use
if [ "$ISKAFKASERVER" = "true" ]; then
  JMX_REMOTE_PORT=$JMX_PORT
else
  JMX_REMOTE_PORT=$CLIENT_JMX_PORT
fi
if [ -n "$JMX_REMOTE_PORT" ]; then
  KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_REMOTE_PORT"
fi
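A hedged usage sketch after the change: CLIENT_JMX_PORT is the new variable introduced above, and 9999 is just an arbitrary free port chosen for illustration (leaving CLIENT_JMX_PORT unset also avoids the conflict, since the tool's JVM then opens no JMX port of its own):
# Query the broker's JMX endpoint (still 1099) while the tool's own JVM binds a different port
export CLIENT_JMX_PORT=9999
/usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.JmxTool \
  --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec \
  --jmx-url service:jmx:rmi:///jndi/rmi://`hostname`:1099/jmxrmi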
02-16-2017
07:42 PM
Does setting hive.mv.file.thread=0 reduce the performance of the insert query? Can you explain what this configuration has to do with the HDP 2.5 upgrade? Was the move-task parallelism not present in the previous version?
12-22-2016
12:51 PM
1 Kudo
SYMPTOM: Many heap dump files are being created under the directory controlled by the Java flag -XX:HeapDumpPath. The user believed HiveServer2 generated these dumps, since he had configured this flag for HiveServer2. However, the HiveServer2 logs contained no trace of service startup or shutdown, and no sign of any failure.
ROOT CAUSE: As a first step of the heap dump analysis, we examined the arguments of the process that produced the dumps, which looked like this:
hive 12345 1234 9.8 27937992 26020536 ? Sl Dec01 19887:12 \_ /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx24576m -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native:/usr/hdp/2.4.2.-258/hadoop/lib/native/Linux-amd... hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.4.2.-258/hive/lib/hive-common-1.2.1.2.3.4.74-1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan file:/tmp/hive/261a57a5-caab-4f98-9fa2-f50209ba29e9/*****/-local-10006/plan.xml -jobconffile file:/tmp/hive/K*****/-local-10007/jobconf.xml
These arguments show that the process is not HiveServer2; it is a local map-join task (ExecDriver -localtask) spun up by the Hive CLI driver. We then looked at the hive-env file, which was misconfigured, especially HADOOP_CLIENT_OPTS:
if [ "$SERVICE" = "cli" ]; then
if [ -z "$DEBUG" ]; then
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
else
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
fi
fi
# The heap size of the jvm started by hive shell script can be controlled via:
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
After looking this over, we are confident that it was the Hive CLI process, which spun up a map-side join, that went OOM, not HiveServer2. WORKAROUND: NA RESOLUTION: Ask the user to modify hive-env and set HADOOP_CLIENT_OPTS appropriately.
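A minimal hive-env sketch of what setting HADOOP_CLIENT_OPTS "appropriately" could look like; the directory /var/log/hive/cli is an illustrative choice, not a value from this case:
# Hypothetical hive-env fragment: give CLI-spawned JVMs (including local map-join tasks)
# their own heap dump directory so their dumps are not mistaken for HiveServer2 dumps
if [ "$SERVICE" = "cli" ]; then
  export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive/cli"
fi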
08-27-2018
02:00 PM
Got the same issue multiple times on our HDP 2.3 cluster.
12-22-2016
06:32 AM
2 Kudos
@Yukti Agrawal
Open the RM UI, click on the applicationId, and follow the corresponding logs link. Among the log files you will see a file named dag_*.dot, which gives you the graph of the query/MR job being executed. Alternatively, if the execution engine is Tez, you can use the Tez UI to view the actual query as well.
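As an illustrative follow-up (not part of the original answer): once the dag_*.dot file is available locally, it can be rendered with Graphviz. The application ID and file name below are placeholders:
# Pull the aggregated container logs (these may include the dag_*.dot file)
yarn logs -applicationId application_1480000000000_0001 > app_logs.txt
# Render the DAG description as an image (requires Graphviz)
dot -Tpng dag_1480000000000_0001_1.dot -o dag.png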
12-21-2016
05:58 PM
1 Kudo
SYMPTOM: The Hive Metastore process goes down frequently with the following exception: 2016-12-16 01:30:45,016 ERROR [Thread[Thread-5,5,main]]: thrift.TokenStoreDelegationTokenSecretManager (TokenStoreDelegationTokenSecretManager.java:run(331)) - ExpiredTokenRemover thread received unexpected exception. org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:131)
at org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:76)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.removeExpiredTokens(TokenStoreDelegationTokenSecretManager.java:256)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager$ExpiredTokenRemover.run(TokenStoreDelegationTokenSecretManager.java:319)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47)
at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131)
at org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88)
at org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80)
at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:392)
at org.apache.hadoop.hive.metastore.ObjectStore.getToken(ObjectStore.java:6412)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
at com.sun.proxy.$Proxy0.getToken(Unknown Source)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:123)
... 4 more
2016-12-16 01:30:45,017 INFO [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5643)) - Shutting down hive metastore.
ROOT CAUSE: When the Hive Metastore starts, the DelegationTokenSecretManager keeps using the same ObjectStore as other threads, which leads to this concurrency issue. WORKAROUND: NA RESOLUTION: This concurrency issue is fixed by HIVE-11616 and HIVE-13090, so those patches need to be applied.
12-21-2016
05:43 PM
1 Kudo
SYMPTOM: HiveServer2 is hung and cannot execute even a simple query such as "show tables". During the investigation we took several jstacks and realized that the following thread was making very slow progress. "HiveServer2-Handler-Pool: Thread-86129" #86129 prio=5 os_prio=0 tid=0x00007f3ad9e1a800 nid=0x1003b runnable [0x00007f3a73b0a000]
java.lang.Thread.State: RUNNABLE
at java.util.HashMap$TreeNode.find(HashMap.java:1851)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
at java.util.HashMap.putVal(HashMap.java:637)
at java.util.HashMap.put(HashMap.java:611)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:290)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$JoinerPPD.process(OpProcFactory.java:464)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10167)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:406)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:290)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
- locked <0x00000005c1e1bc18> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy45.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: Before Hive 2, the compilation stage of a query in HiveServer2 is single-threaded, so only one query can be compiled at a time and other queries remain in the wait state. The jstack shows that during compilation this thread was running the expression factory for predicate pushdown processing, in which each processor determines whether the expression is a candidate for predicate pushdown optimization for the given operator. The user was running a huge query at that time, containing more than 300 'CASE ... WHEN ... THEN' conditions, which made this step take too long. WORKAROUND: Ask the customer to set hive.optimize.ppd=false at the session level while running this query, and ask them to rewrite the SQL in a more optimized way. RESOLUTION: Set hive.optimize.ppd=false at the session level.
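A small sketch of applying the workaround only for the session that runs the problematic query; the JDBC URL and script name are placeholders:
# Disable predicate pushdown for this session only, leaving the global setting untouched
beeline -u "jdbc:hive2://<hs2-host>:10000/default" \
  --hiveconf hive.optimize.ppd=false \
  -f huge_case_when_query.hql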
12-21-2016
05:24 PM
1 Kudo
SYMPTOM: HiveServer2 responds very slowly even to simple queries, takes a long time to complete them, or appears unresponsive. We took several incremental jstacks and found one thread that was making a metastore call and progressing very slowly. Thread 24233: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.socketRead(java.io.FileDescriptor, byte[], int, int, int) @bci=8, line=116 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=170 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=141 (Compiled frame)
- oracle.net.ns.Packet.receive() @bci=157, line=311 (Compiled frame)
- oracle.net.ns.DataPacket.receive() @bci=1, line=105 (Compiled frame)
- oracle.net.ns.NetInputStream.getNextPacket() @bci=48, line=305 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[], int, int) @bci=33, line=249 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[]) @bci=5, line=171 (Compiled frame)
- oracle.net.ns.NetInputStream.read() @bci=5, line=89 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket() @bci=11, line=123 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.read() @bci=55, line=84 (Compiled frame)
- oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1() @bci=6, line=429 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.receive() @bci=16, line=397 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.doRPC() @bci=116, line=257 (Compiled frame)
- oracle.jdbc.driver.T4C8Oall.doOALL(boolean, boolean, boolean, boolean, boolean, oracle.jdbc.internal.OracleStatement$SqlKind, int, byte[], int, oracle.jdbc.driver.Accessor[], int, oracle.jdbc.driver.Accessor[], int, byte[], char[], short[], int, oracle.jdbc.driver.DBConversion, byte[], java.io.InputStream[][], byte[][][], oracle.jdbc.oracore.OracleTypeADT[][], oracle.jdbc.driver.OracleStatement, byte[], char[], short[], oracle.jdbc.driver.T4CTTIoac[], int[], int[], int[], oracle.jdbc.driver.NTFDCNRegistration, oracle.jdbc.driver.ByteArray, long[], int[], boolean) @bci=903, line=587 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean, int) @bci=780, line=225 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean) @bci=23, line=53 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe() @bci=37, line=774 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.executeMaybeDescribe() @bci=106, line=925 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout() @bci=250, line=1111 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeInternal() @bci=145, line=4798 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeQuery() @bci=18, line=4845 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery() @bci=4, line=1501 (Compiled frame)
- com.jolbox.bonecp.PreparedStatementHandle.executeQuery() @bci=68, line=174 (Compiled frame)
- org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery() @bci=4, line=375 (Compiled frame)
- org.datanucleus.store.rdbms.SQLController.executeStatementQuery(org.datanucleus.ExecutionContext, org.datanucleus.store.connection.ManagedConnection, java.lang.String, java.sql.PreparedStatement) @bci=120, line=552 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.JoinListStore.listIterator(org.datanucleus.state.ObjectProvider, int, int) @bci=329, line=770 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.listIterator(org.datanucleus.state.ObjectProvider) @bci=4, line=93 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.iterator(org.datanucleus.state.ObjectProvider) @bci=2, line=83 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.loadFromStore() @bci=77, line=264 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.iterator() @bci=8, line=492 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToFieldSchemas(java.util.List) @bci=21, line=1199 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor, boolean) @bci=39, line=1266 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor) @bci=3, line=1281 (Compiled frame)
ROOT CAUSE: After further investigation, we found that the metastore database was running outside the cluster and was shared with other applications. When the metastore database is under load, it takes too long to respond to metastore queries. Since this happens in the compilation stage of the query, which is single-threaded, other incoming queries landing on HiveServer2 must wait until this query finishes compiling. WORKAROUND: NA RESOLUTION: Run the metastore database inside the cluster and do not share it with other applications if you run a heavy workload on HiveServer2. Also check the network between the HiveServer2 node and the metastore database node for any bottleneck.
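Some rough, illustrative checks for metastore database latency from the HiveServer2 host; the host name is a placeholder, and port 1521 is assumed here only because the stack above shows an Oracle JDBC driver:
# Placeholder host name for the metastore database
DB_HOST=metastore-db.example.com
ping -c 5 "$DB_HOST"                 # baseline network round-trip time to the DB host
nc -zv "$DB_HOST" 1521               # confirm the DB listener port is reachable (1521 assumes Oracle)
time beeline -u "jdbc:hive2://localhost:10000/default" -e "show databases;"   # end-to-end metastore round trip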
12-21-2016
05:07 PM
1 Kudo
SYMPTOM: The beeline connection to HiveServer2 fails intermittently with the exception "The initCause method cannot be used. To set the cause of this exception, use a constructor with a Throwable[] argument." Looking further at the HiveServer2 logs, we saw a stack trace like this: java.sql.SQLException: Could not retrieve transation read-only status server
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2259)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getProductName(MetaStoreDirectSql.java:171)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.determineDbType(MetaStoreDirectSql.java:152)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:122)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:300)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:263)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy7.setConf(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.setConf(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:56)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5798)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 35 more
Caused by: java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1086)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3972)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3943)
at com.jolbox.bonecp.ConnectionHandle.isReadOnly(ConnectionHandle.java:867)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:422)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getNucleusConnection(RDBMSStoreManager.java:1382)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2245)
... 51 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 381,109 milliseconds ago. The last packet sent successfully to the server was
381,109 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3988)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2598)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2828)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2777)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1651)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3966)
... 56 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3969)
ROOT CAUSE: After analyzing a tcpdump capture, we found that it is the MySQL server side that is dropping network packets very frequently. WORKAROUND: NA RESOLUTION: Check and fix the network issue between the HiveServer2 host and the MySQL host.
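An illustrative capture command of the kind used for this analysis; the interface name, MySQL host, and output file are placeholders, and 3306 is the default MySQL port:
# Run on the HiveServer2 host: capture traffic to/from the metastore MySQL server for offline analysis
tcpdump -i eth0 -w hs2_to_mysql.pcap host mysql-db.example.com and port 3306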