Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2930 | 02-12-2020 03:17 PM |
| | 2138 | 08-10-2017 09:42 AM |
| | 12481 | 07-28-2017 03:57 AM |
| | 3426 | 07-19-2017 02:43 AM |
| | 2528 | 07-13-2017 11:42 AM |
12-22-2016 05:42 AM
@ARUN Yes, you can still retrieve application stats from the Application Timeline Server (ATS).
12-22-2016 05:33 AM
@sathish jeganathan Kafka Connect is a good fit if you have both a source and a destination Kafka cluster; it lets you stream data between the two. I have not tried it myself, so I can't comment on its pros and cons, but NiFi certainly has a rich set of processors that let you stream the data while also transforming/enriching it.
12-22-2016 05:28 AM
2 Kudos
@ARUN You can decrease the number of applications displayed in the YARN web UI; it is controlled by the yarn.resourcemanager.max-completed-applications parameter. By default the ResourceManager keeps 10,000 completed applications in memory, which can slow down the UI.
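As a sketch, the setting above can be tuned in yarn-site.xml (the value shown is an illustrative choice, not a recommendation):

```xml
<!-- yarn-site.xml: cap how many completed applications the
     ResourceManager retains in memory (default is 10000).
     1000 here is an illustrative value only. -->
<property>
  <name>yarn.resourcemanager.max-completed-applications</name>
  <value>1000</value>
</property>
```

The ResourceManager needs a restart for the new value to take effect.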
12-22-2016 05:07 AM
@Huahua Wei Could you please share the output of these commands: `lsof -p <pid of zookeeper> | grep log` and `lsof -p <pid of zookeeper> | grep out`?
12-22-2016 05:04 AM
@sathish jeganathan For this kind of data-ingestion use case I would suggest Apache NiFi. You can use the PutKafka and PutHDFS processors; with PutHDFS you can write your files directly to HDFS. For PutHDFS you can follow this document: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/
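A minimal sketch of a PutHDFS processor configuration (the property names come from the processor; the paths and values below are placeholders you would adapt to your cluster):

```
Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Directory                      : /data/landing
Conflict Resolution Strategy   : replace
```

Pointing the processor at your cluster's core-site.xml and hdfs-site.xml is what lets it resolve the NameNode and write directly into HDFS.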
12-21-2016 07:11 PM
3 Kudos
SYMPTOM: The problem was seen when the client was running Metastore HA with both instances configured with org.apache.hadoop.hive.ql.txn.compactor.Initiator. This is what we observed in the logs of both metastore services:
// metastore 1
ERROR compactor.Worker (Worker.java:run(181)) - Caught an exception in the main loop of compactor worker lnxhdpap02.smrcy.com-33, MetaException(message:Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
// metastore 2
ERROR txn.CompactionTxnHandler (CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next element for compaction, Deadlock found when trying to get lock; try restarting transaction
ROOT CAUSE: org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs inside the metastore service to manage compactions of ACID tables. There should be exactly one instance of this thread, even with multiple Thrift services.
WORKAROUND: NA
RESOLUTION: Enable "hive.compactor.initiator.on" on only a single instance of the metastore service.
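A hedged sketch of the hive-site.xml for the two metastore instances (only one runs the Initiator, so the two instances stop competing for the same compaction rows in the backing RDBMS):

```xml
<!-- metastore 1 (hive-site.xml): run the compaction Initiator here -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>

<!-- metastore 2 (hive-site.xml): keep the Initiator off so two
     instances don't deadlock on the transaction database -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>false</value>
</property>
```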
12-21-2016 05:58 PM
1 Kudo
SYMPTOM: The Hive metastore process goes down frequently with the following exceptions: 2016-12-16 01:30:45,016 ERROR [Thread[Thread-5,5,main]]: thrift.TokenStoreDelegationTokenSecretManager (TokenStoreDelegationTokenSecretManager.java:run(331)) - ExpiredTokenRemover thread received unexpected exception. org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:131)
at org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:76)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.removeExpiredTokens(TokenStoreDelegationTokenSecretManager.java:256)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager$ExpiredTokenRemover.run(TokenStoreDelegationTokenSecretManager.java:319)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47)
at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131)
at org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88)
at org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80)
at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:392)
at org.apache.hadoop.hive.metastore.ObjectStore.getToken(ObjectStore.java:6412)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
at com.sun.proxy.$Proxy0.getToken(Unknown Source)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:123)
... 4 more
2016-12-16 01:30:45,017 INFO [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5643)) - Shutting down hive metastore.
ROOT CAUSE: When the Hive metastore starts, the DelegationTokenSecretManager maintains the same ObjectStore instance, which leads to a concurrency issue.
WORKAROUND: NA
RESOLUTION: This concurrency issue was fixed by HIVE-11616 and HIVE-13090, so those patches need to be applied.
12-21-2016 05:43 PM
1 Kudo
SYMPTOM: HiveServer2 is hung and cannot execute even a simple query like `show tables`. During the investigation we took several jstacks and realized that the following thread was progressing very slowly: "HiveServer2-Handler-Pool: Thread-86129" #86129 prio=5 os_prio=0 tid=0x00007f3ad9e1a800 nid=0x1003b runnable [0x00007f3a73b0a000]
java.lang.Thread.State: RUNNABLE
at java.util.HashMap$TreeNode.find(HashMap.java:1851)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
at java.util.HashMap.putVal(HashMap.java:637)
at java.util.HashMap.put(HashMap.java:611)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:290)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$JoinerPPD.process(OpProcFactory.java:464)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10167)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:406)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:290)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
- locked <0x00000005c1e1bc18> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy45.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: Before Hive 2, the compilation stage of a query in HiveServer2 is single-threaded, so only one query can compile at a time and the other queries remain in a wait state. We observed that during compilation this thread was executing the expression factory for predicate-pushdown processing, in which each processor determines whether the expression is a candidate for the predicate-pushdown optimization for the given operator. The user was running a huge query at the time, consisting of more than 300 'case/when' conditions, which was taking too long.
WORKAROUND: Ask the customer to set hive.optimize.ppd=false at the session level while running this query, and ask them to rewrite the SQL in a more optimized way.
RESOLUTION: Set hive.optimize.ppd=false at the session level.
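For example, the workaround can be applied per session in beeline before running the problematic query (the query itself is a placeholder here):

```sql
-- Disable predicate pushdown for this session only; other sessions
-- keep the optimizer default (hive.optimize.ppd=true)
SET hive.optimize.ppd=false;
-- <the problematic query with many CASE/WHEN branches goes here>
```

Because `SET` applies only to the current session, other users' queries still benefit from predicate pushdown.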
12-21-2016 05:24 PM
1 Kudo
SYMPTOM: HiveServer2 is too slow to respond to simple queries, takes a long time to complete them, or is sometimes unresponsive. We took some incremental jstacks and found one thread that was making a metastore call and progressing very slowly: Thread 24233: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.socketRead(java.io.FileDescriptor, byte[], int, int, int) @bci=8, line=116 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=170 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=141 (Compiled frame)
- oracle.net.ns.Packet.receive() @bci=157, line=311 (Compiled frame)
- oracle.net.ns.DataPacket.receive() @bci=1, line=105 (Compiled frame)
- oracle.net.ns.NetInputStream.getNextPacket() @bci=48, line=305 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[], int, int) @bci=33, line=249 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[]) @bci=5, line=171 (Compiled frame)
- oracle.net.ns.NetInputStream.read() @bci=5, line=89 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket() @bci=11, line=123 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.read() @bci=55, line=84 (Compiled frame)
- oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1() @bci=6, line=429 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.receive() @bci=16, line=397 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.doRPC() @bci=116, line=257 (Compiled frame)
- oracle.jdbc.driver.T4C8Oall.doOALL(boolean, boolean, boolean, boolean, boolean, oracle.jdbc.internal.OracleStatement$SqlKind, int, byte[], int, oracle.jdbc.driver.Accessor[], int, oracle.jdbc.driver. Accessor[], int, byte[], char[], short[], int, oracle.jdbc.driver.DBConversion, byte[], java.io.InputStream[][], byte[][][], oracle.jdbc.oracore.OracleTypeADT[][], oracle.jdbc.driver.OracleStatement, byte[], char[], short[], oracle.jdbc.driver.T4CTTIoac[], int[], int[], int[], oracle.jdbc.driver.NTFDCNRegistration, oracle.jdbc.driver.ByteArray, long[], int[], boolean) @bci=903, line=587 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean, int) @bci=780, line=225 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean) @bci=23, line=53 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe() @bci=37, line=774 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.executeMaybeDescribe() @bci=106, line=925 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout() @bci=250, line=1111 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeInternal() @bci=145, line=4798 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeQuery() @bci=18, line=4845 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery() @bci=4, line=1501 (Compiled frame)
- com.jolbox.bonecp.PreparedStatementHandle.executeQuery() @bci=68, line=174 (Compiled frame)
- org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery() @bci=4, line=375 (Compiled frame)
- org.datanucleus.store.rdbms.SQLController.executeStatementQuery(org.datanucleus.ExecutionContext, org.datanucleus.store.connection.ManagedConnection, java.lang.String, java.sql.PreparedStatement) @ bci=120, line=552 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.JoinListStore.listIterator(org.datanucleus.state.ObjectProvider, int, int) @bci=329, line=770 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.listIterator(org.datanucleus.state.ObjectProvider) @bci=4, line=93 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.iterator(org.datanucleus.state.ObjectProvider) @bci=2, line=83 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.loadFromStore() @bci=77, line=264 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.iterator() @bci=8, line=492 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToFieldSchemas(java.util.List) @bci=21, line=1199 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor, boolean) @bci=39, line=1266 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor) @bci=3, line=1281 (Compiled frame)
ROOT CAUSE: After further investigation, we found that the metastore DB was running outside the cluster and was shared with other applications. When there was load on the metastore database, it took too long to respond to queries. Since this happens in the compilation stage of a query, which is single-threaded, any other incoming query landing on HiveServer2 waits until this query finishes compiling.
WORKAROUND: NA
RESOLUTION: Run the metastore database inside the cluster and don't share it with other applications if you are running a large workload on HiveServer2. Also check the network between the HiveServer2 node and the MySQL node for any bottleneck.
12-21-2016 05:07 PM
1 Kudo
SYMPTOM: beeline connections to HiveServer2 fail intermittently with the following exception: "The initCause method cannot be used. To set the cause of this exception, use a constructor with a Throwable[] argument." Looking further at the HiveServer2 logs, we saw a stack trace like this: java.sql.SQLException: Could not retrieve transation read-only status server
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2259)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getProductName(MetaStoreDirectSql.java:171)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.determineDbType(MetaStoreDirectSql.java:152)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:122)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:300)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:263)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy7.setConf(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.setConf(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:56)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5798)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 35 more
Caused by: java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1086)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3972)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3943)
at com.jolbox.bonecp.ConnectionHandle.isReadOnly(ConnectionHandle.java:867)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:422)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getNucleusConnection(RDBMSStoreManager.java:1382)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2245)
... 51 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 381,109 milliseconds ago. The last packet sent successfully to the server was
381,109 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3988)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2598)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2828)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2777)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1651)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3966)
... 56 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3969)
ROOT CAUSE: After analyzing a TCP dump, we found that it is the MySQL server that is dropping network packets very frequently.
WORKAROUND: NA
RESOLUTION: Check and fix the network issue between the HiveServer2 host and the MySQL host.