Support Questions
Find answers, ask questions, and share your expertise

Hive query failing after adding new metastore

Contributor

I'm having somes issues when i run some hive queries, i had one HiveMS on node "monde2" then i added a new HiveMS and HiveServer2 on "mnode4", the issue appears only when the metastore when "mnode4" is selected.

 

Example :

Running this query on MS mnode2 finish without any issue

hive -hiveconf hive.root.logger=ERROR,console -hiveconf mapreduce.job.reduces=200 -hiveconf hive.execution.engine=mr -hiveconf hive.metastore.uris="thrift://mnode2.<fqdn>:9083" -e "SELECT 'eu' region, CONCAT('eu_', clecommande) order_id, * FROM replication_prd.eu_facturation_commande;"

 But if i change the MS to mnode4, the query fails with some errors

hive -hiveconf hive.root.logger=ERROR,console -hiveconf mapreduce.job.reduces=200 -hiveconf hive.execution.engine=mr -hiveconf hive.metastore.uris="thrift://mnode4.<fqdn>:9083" -e "SELECT 'eu' region, CONCAT('eu_', clecommande) order_id, * FROM replication_prd.eu_facturation_commande;"
2021-07-08 18:27:21,027 WARN  [main] conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
2021-07-08 18:27:21,028 WARN  [main] conf.HiveConf: HiveConf of name hive.mapred.supports.subdirectories does not exist
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.

Logging initialized using configuration in file:/etc/hive/2.6.5.1100-53/0/hive-log4j.properties
Query ID = ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
21/07/08 18:27:27 [main]: ERROR mr.ExecDriver: yarn
Starting Job = job_1625042589026_50921, Tracking URL = http://mnode1.<fqdn>:8088/proxy/application_1625042589026_50921/
Kill Command = /usr/hdp/2.6.5.1100-53/hadoop/bin/hadoop job  -kill job_1625042589026_50921
Hadoop job information for Stage-1: number of mappers: 305; number of reducers: 0
2021-07-08 18:27:41,636 Stage-1 map = 0%,  reduce = 0%
2021-07-08 18:28:04,177 Stage-1 map = 1%,  reduce = 0%, Cumulative CPU 1187.97 sec
2021-07-08 18:28:05,243 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 1601.78 sec
...
2021-07-08 18:29:28,155 Stage-1 map = 97%,  reduce = 0%, Cumulative CPU 16950.27 sec
2021-07-08 18:29:29,211 Stage-1 map = 98%,  reduce = 0%, Cumulative CPU 17040.87 sec
2021-07-08 18:29:31,339 Stage-1 map = 99%,  reduce = 0%, Cumulative CPU 17136.57 sec
21/07/08 18:29:55 [Heartbeater-0]: ERROR lockmgr.DbTxnManager: Unable to find lock lockid:24246126
21/07/08 18:29:55 [Heartbeater-0]: ERROR lockmgr.DbTxnManager: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<REALM> (auth:KERBEROS): null
java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more
org.apache.hadoop.hive.ql.lockmgr.LockException: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn>                            (auth:KERBEROS): null
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:714)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        ... 7 more
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more
Ended Job = job_1625042589026_50921 with exception 'org.apache.hadoop.hive.ql.lockmgr.LockException(Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null)'
21/07/08 18:29:56 [main]: ERROR exec.Task: Ended Job = job_1625042589026_50921 with exception 'org.apache.hadoop.hive.ql.lockmgr.LockException(Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca                           79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null)'
org.apache.hadoop.hive.ql.lockmgr.LockException: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:714)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        ... 7 more
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<REALM> (auth:KERBEROS): null
21/07/08 18:29:59 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null

Everything works well before adding the metastore

HDP : 2.6.5

Ambari : 2.6.2.2

Hive : 1.2.1000

 

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Hive query failing after adding new metastore

Contributor

hi @Shelton 

Thanks for the documentation but problem was a time issue

first HMS was in UTC and the new one was CEST

Changing mnode2 to UTC fixed the issue

View solution in original post

2 REPLIES 2

Re: Hive query failing after adding new metastore

Mentor

@enirys 
That's correct to successfully set up an HMS  HA you  MUST ensure the metadata DB should have followed the steps mention in this official document:
Configuring High Availability for the Hive Metastore

High Availability for Hive Metastore

That's should help you sort of the stale metadata issue

Re: Hive query failing after adding new metastore

Contributor

hi @Shelton 

Thanks for the documentation but problem was a time issue

first HMS was in UTC and the new one was CEST

Changing mnode2 to UTC fixed the issue

View solution in original post