Created on 06-21-2022 03:24 AM - last edited on 06-21-2022 07:23 AM by VidyaSargur
NiFi version 1.16.1
nifi-hive3-nar-1.16.2
While loading data into Hive tables using PutHive3Streaming, some of the table loads fail with the errors below. I tried changing the commit size, but it didn't help. Out of 50 Hive table loads, only 5 fail with this error, and the failure is repeatable. The record count for each of these tables is less than 1 million.
2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
org.apache.nifi.processors.hive.PutHive3Streaming$ShouldRetryException: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:512)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.streaming.TransactionError: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:877)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commit(HiveStreamingConnection.java:841)
at org.apache.hive.streaming.HiveStreamingConnection.commitTransaction(HiveStreamingConnection.java:513)
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:499)
... 16 common frames omitted
Caused by: org.apache.hadoop.hive.metastore.api.TxnAbortedException: Transaction txnid:10346416 already aborted
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_commit_txn(ThriftHiveMetastore.java:5192)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.commit_txn(ThriftHiveMetastore.java:5179)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.commitTxn(HiveMetaStoreClient.java:2491)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
at com.sun.proxy.$Proxy249.commitTxn(Unknown Source)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:859)
... 19 common frames omitted
2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Failed to abort Hive Streaming transaction { metaStoreUri: thrift://vsgcnredhad12.in.reach.com:9083,thrift://vsgcnredhad13.in.reach.com:9083, database: tigfin_nifi, table: t_ap_invoice_lines_all } due to exception
org.apache.hive.streaming.StreamingException: Transaction state is not OPEN. Missing beginTransaction?
at org.apache.hive.streaming.HiveStreamingConnection.checkState(HiveStreamingConnection.java:500)
at org.apache.hive.streaming.HiveStreamingConnection.abortTransaction(HiveStreamingConnection.java:519)
at org.apache.nifi.processors.hive.PutHive3Streaming.abortConnection(PutHive3Streaming.java:652)
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:559)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Created 06-21-2022 07:51 AM
I would check the Hive metastore logs for more details on the aborted transactions, searching by transaction ID, since the transactions are aborted on the Hive side and not on the NiFi side.
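As a rough aid for that search, here is a minimal Python sketch that pulls the aborted transaction IDs out of the NiFi log and then lists the metastore log lines that mention them. The function names are made up for illustration, and the metastore log format is an assumption: the sketch only assumes the transaction ID appears somewhere on the line as a standalone number, which may not hold for every Hive version or logging setup.

```python
import re
from typing import Iterable, List, Set

# Matches the "txnid:<number>" token as it appears in the NiFi log in this thread.
TXN_ID_PATTERN = re.compile(r"txnid:(\d+)")


def aborted_txn_ids(nifi_log_lines: Iterable[str]) -> Set[str]:
    """Collect transaction IDs from NiFi log lines that report an aborted transaction."""
    ids: Set[str] = set()
    for line in nifi_log_lines:
        if "already aborted" in line:
            ids.update(TXN_ID_PATTERN.findall(line))
    return ids


def lines_mentioning(log_lines: Iterable[str], txn_ids: Set[str]) -> List[str]:
    """Return log lines that contain any of the given transaction IDs.

    Assumption: the ID shows up as a whole run of digits on the line.
    Whole-number comparison keeps ID 1034 from matching 10346416.
    """
    hits: List[str] = []
    for line in log_lines:
        numbers = set(re.findall(r"\d+", line))
        if numbers & txn_ids:
            hits.append(line.rstrip("\n"))
    return hits


def scan(nifi_log_path: str, metastore_log_path: str) -> List[str]:
    """Read both log files (paths supplied by the caller) and return matching metastore lines."""
    with open(nifi_log_path, encoding="utf-8", errors="replace") as nifi_log:
        txn_ids = aborted_txn_ids(nifi_log)
    with open(metastore_log_path, encoding="utf-8", errors="replace") as metastore_log:
        return lines_mentioning(metastore_log, txn_ids)
```

Calling `scan()` with the paths to your own NiFi application log and Hive metastore log returns the metastore lines to read first. The lines just before each hit are usually where the reason for the abort would be logged, if the metastore recorded one.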
Created 06-23-2022 03:50 AM
@Althotta, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur,