Created 10-13-2021 01:19 AM
We will often see the below error (either during a fresh installation or after an upgrade), where jobs intermittently fail (in my case it failed in Hive):
2021-08-27 08:46:38,616 ERROR org.apache.curator.framework.imps.EnsembleTracker: [main-EventThread]: Invalid config event received: {server.1=********.mms.*.local:3181:4181:participant, version=0, server.3=**********.mms.*.local:3181:4181:participant, server.2=**********.mms.**.local:3181:4181:participant}
2021-08-27 08:46:43,410 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Thread-15]: Cleaning based on min open txn id: 28
2021-08-27 08:46:43,744 WARN org.apache.zookeeper.ClientCnxn: [main-SendThread(*********.mms.***.local:2181)]: Session 0x1061dda12d30069 for server ********.mms.**.local/10.4.7.17:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Packet len73388953 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:122) ~[zookeeper-3.5.5.7.1.4.37-1.jar:3.5.5.7.1.4.37-1]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:86) ~[zookeeper-3.5.5.7.1.4.37-1.jar:3.5.5.7.1.4.37-1]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363) ~[zookeeper-3.5.5.7.1.4.37-1.jar:3.5.5.7.1.4.37-1]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223) [zookeeper-3.5.5.7.1.4.37-1.jar:3.5.5.7.1.4.37-1]
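To see why the client closes the socket, it helps to compare the rejected packet size from the log (73388953 bytes, roughly 70 MB) against ZooKeeper's default jute.maxbuffer of 4 MB; the client refuses to read any packet longer than that limit. A quick arithmetic sketch:

```shell
# Compare the packet length from the log against ZooKeeper's
# default jute.maxbuffer (4 MB = 4194304 bytes).
packet_len=73388953
default_max=$((4 * 1024 * 1024))
echo "packet: ${packet_len} bytes, default limit: ${default_max} bytes"
echo "packet is $((packet_len / default_max))x over the default limit"
```

So the incoming packet is over 17 times the default limit, which is exactly the condition that triggers the "Packet len... is out of range!" IOException.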
Created 10-13-2021 01:55 AM
It's because the packet length is larger than what jute.maxbuffer allows. Increasing the below property helped:
the value of jute.maxbuffer to 100MB in ZooKeeper.
I then appended the same setting (-Djute.maxbuffer=104857600) to HS2 (Hive - Configuration - 'Java Configuration Options for HiveServer2') and
HMS (Hive - Configuration - 'Java Configuration Options for Hive Metastore Server').
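As a rough sketch of the change (assuming 100 MB means 100 * 1024 * 1024 bytes, and that the same value is passed as a JVM system property to HS2 and HMS):

```shell
# 100 MB expressed in bytes -- this is where 104857600 comes from.
echo $((100 * 1024 * 1024))

# The matching JVM flag appended to the HiveServer2 and Hive Metastore
# Java options (property name and value as used in the fix above).
HIVE_JVM_FLAG="-Djute.maxbuffer=104857600"
echo "${HIVE_JVM_FLAG}"
```

Note that the client-side limit (HS2/HMS) and the ZooKeeper server-side limit should be raised together; if only one side is increased, the other side will still reject large packets.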
Created 10-13-2021 01:27 AM
Case Description: Prod Spark job failing intermittently after upgrade
ISSUE DESCRIPTION:
Prod Spark job failing intermittently after the upgrade.
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:218)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:136)
TIME OF ISSUE:
21/08/26 19:44:51
CUSTOMER BUSINESS IMPACT:
Multiple issues in Prod after the upgrade.