Support Questions
Find answers, ask questions, and share your expertise

Hive query failing after adding new metastore

Solved Go to solution

Hive query failing after adding new metastore

Explorer

I'm having somes issues when i run some hive queries, i had one HiveMS on node "monde2" then i added a new HiveMS and HiveServer2 on "mnode4", the issue appears only when the metastore when "mnode4" is selected.

 

Example :

Running this query on MS mnode2 finish without any issue

hive -hiveconf hive.root.logger=ERROR,console -hiveconf mapreduce.job.reduces=200 -hiveconf hive.execution.engine=mr -hiveconf hive.metastore.uris="thrift://mnode2.<fqdn>:9083" -e "SELECT 'eu' region, CONCAT('eu_', clecommande) order_id, * FROM replication_prd.eu_facturation_commande;"

 But if i change the MS to mnode4, the query fails with some errors

hive -hiveconf hive.root.logger=ERROR,console -hiveconf mapreduce.job.reduces=200 -hiveconf hive.execution.engine=mr -hiveconf hive.metastore.uris="thrift://mnode4.<fqdn>:9083" -e "SELECT 'eu' region, CONCAT('eu_', clecommande) order_id, * FROM replication_prd.eu_facturation_commande;"
2021-07-08 18:27:21,027 WARN  [main] conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
2021-07-08 18:27:21,028 WARN  [main] conf.HiveConf: HiveConf of name hive.mapred.supports.subdirectories does not exist
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.

Logging initialized using configuration in file:/etc/hive/2.6.5.1100-53/0/hive-log4j.properties
Query ID = ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
21/07/08 18:27:27 [main]: ERROR mr.ExecDriver: yarn
Starting Job = job_1625042589026_50921, Tracking URL = http://mnode1.<fqdn>:8088/proxy/application_1625042589026_50921/
Kill Command = /usr/hdp/2.6.5.1100-53/hadoop/bin/hadoop job  -kill job_1625042589026_50921
Hadoop job information for Stage-1: number of mappers: 305; number of reducers: 0
2021-07-08 18:27:41,636 Stage-1 map = 0%,  reduce = 0%
2021-07-08 18:28:04,177 Stage-1 map = 1%,  reduce = 0%, Cumulative CPU 1187.97 sec
2021-07-08 18:28:05,243 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 1601.78 sec
...
2021-07-08 18:29:28,155 Stage-1 map = 97%,  reduce = 0%, Cumulative CPU 16950.27 sec
2021-07-08 18:29:29,211 Stage-1 map = 98%,  reduce = 0%, Cumulative CPU 17040.87 sec
2021-07-08 18:29:31,339 Stage-1 map = 99%,  reduce = 0%, Cumulative CPU 17136.57 sec
21/07/08 18:29:55 [Heartbeater-0]: ERROR lockmgr.DbTxnManager: Unable to find lock lockid:24246126
21/07/08 18:29:55 [Heartbeater-0]: ERROR lockmgr.DbTxnManager: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<REALM> (auth:KERBEROS): null
java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more
org.apache.hadoop.hive.ql.lockmgr.LockException: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn>                            (auth:KERBEROS): null
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:714)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        ... 7 more
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more
Ended Job = job_1625042589026_50921 with exception 'org.apache.hadoop.hive.ql.lockmgr.LockException(Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null)'
21/07/08 18:29:56 [main]: ERROR exec.Task: Ended Job = job_1625042589026_50921 with exception 'org.apache.hadoop.hive.ql.lockmgr.LockException(Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca                           79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null)'
org.apache.hadoop.hive.ql.lockmgr.LockException: Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:714)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:699)
        ... 7 more
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock lockid:24246126 could be found, may have timed out
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:484)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater$1.run(DbTxnManager.java:702)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 8 more
Caused by: NoSuchLockException(message:No such lock: lockid:24246126)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result$heartbeat_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$heartbeat_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_heartbeat(ThriftHiveMetastore.java:4253)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.heartbeat(ThriftHiveMetastore.java:4240)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:2076)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
        at com.sun.proxy.$Proxy5.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:481)
        ... 12 more

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<REALM> (auth:KERBEROS): null
21/07/08 18:29:59 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed trying to heartbeat queryId=ksad_20210708182723_0e24aad0-ca79-412e-9e93-8ae5a17b26b3, currentUser: ksad@<fqdn> (auth:KERBEROS): null

Everything works well before adding the metastore

HDP : 2.6.5

Ambari : 2.6.2.2

Hive : 1.2.1000

 

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Hive query failing after adding new metastore

Explorer

hi @Shelton 

Thanks for the documentation but problem was a time issue

first HMS was in UTC and the new one was CEST

Changing mnode2 to UTC fixed the issue

View solution in original post

2 REPLIES 2

Re: Hive query failing after adding new metastore

Mentor

@enirys 
That's correct to successfully set up an HMS  HA you  MUST ensure the metadata DB should have followed the steps mention in this official document:
Configuring High Availability for the Hive Metastore

High Availability for Hive Metastore

That's should help you sort of the stale metadata issue

Re: Hive query failing after adding new metastore

Explorer

hi @Shelton 

Thanks for the documentation but problem was a time issue

first HMS was in UTC and the new one was CEST

Changing mnode2 to UTC fixed the issue

View solution in original post