NiFi: PutHive3Streaming fails with "Transaction state is not OPEN. Missing beginTransaction?"

NiFi version 1.16.1

nifi-hive3-nar-1.16.2

 

While loading data into Hive tables with PutHive3Streaming, some of the table loads fail with the errors below. I tried changing the commit size, but it didn't help. Out of 50 Hive table loads, only 5 fail with this error, and the failure is repeatable. The record count for each of these tables is less than 1 million.

 

2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
org.apache.nifi.processors.hive.PutHive3Streaming$ShouldRetryException: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:512)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.streaming.TransactionError: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:877)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commit(HiveStreamingConnection.java:841)
at org.apache.hive.streaming.HiveStreamingConnection.commitTransaction(HiveStreamingConnection.java:513)
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:499)
... 16 common frames omitted
Caused by: org.apache.hadoop.hive.metastore.api.TxnAbortedException: Transaction txnid:10346416 already aborted
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_commit_txn(ThriftHiveMetastore.java:5192)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.commit_txn(ThriftHiveMetastore.java:5179)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.commitTxn(HiveMetaStoreClient.java:2491)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
at com.sun.proxy.$Proxy249.commitTxn(Unknown Source)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:859)
... 19 common frames omitted
2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Failed to abort Hive Streaming transaction { metaStoreUri: thrift://vsgcnredhad12.in.reach.com:9083,thrift://vsgcnredhad13.in.reach.com:9083, database: tigfin_nifi, table: t_ap_invoice_lines_all } due to exception
org.apache.hive.streaming.StreamingException: Transaction state is not OPEN. Missing beginTransaction?
at org.apache.hive.streaming.HiveStreamingConnection.checkState(HiveStreamingConnection.java:500)
at org.apache.hive.streaming.HiveStreamingConnection.abortTransaction(HiveStreamingConnection.java:519)
at org.apache.nifi.processors.hive.PutHive3Streaming.abortConnection(PutHive3Streaming.java:652)
at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:559)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

1 ACCEPTED SOLUTION

Expert Contributor

I would check the Hive metastore logs for more details on the aborted transactions, searching by transaction ID (for example, txnid:10346416 from the trace above), since the transactions are aborted on the Hive side and not on the NiFi side.
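
If it helps, here is a rough sketch of how that lookup could be scripted. It pulls the aborted transaction IDs out of the NiFi log (using the "txnid:<number> already aborted" message format seen in the trace above) and then prints every metastore log line that mentions one of those IDs. The function names and the file names in the usage comment are my own placeholders, and I am assuming the metastore log prints the ID as a plain number somewhere on the line, so adjust the matching if your logs look different.

```python
import re
import sys
from typing import Iterable, List, Set

# Matches the id in messages such as
# "Transaction txnid:10346416 already aborted" (format taken from the NiFi log).
ABORTED_TXN = re.compile(r"txnid:(\d+) already aborted")


def aborted_txn_ids(nifi_log_lines: Iterable[str]) -> Set[str]:
    """Collect the ids of transactions NiFi reported as already aborted."""
    ids: Set[str] = set()
    for line in nifi_log_lines:
        ids.update(ABORTED_TXN.findall(line))
    return ids


def lines_mentioning(log_lines: Iterable[str], txn_ids: Set[str]) -> List[str]:
    """Return the log lines that contain any of the ids as a whole number.

    Assumption: the metastore log prints the id as plain digits; the
    lookarounds stop 10346416 from matching inside 103464160.
    """
    if not txn_ids:
        return []
    alternatives = "|".join(sorted(txn_ids))
    pattern = re.compile(r"(?<!\d)(?:" + alternatives + r")(?!\d)")
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]


# Hypothetical usage: python find_txn.py nifi-app.log hivemetastore.log
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], errors="replace") as nifi_log:
        ids = aborted_txn_ids(nifi_log)
    with open(sys.argv[2], errors="replace") as metastore_log:
        for hit in lines_mentioning(metastore_log, ids):
            print(hit)
```

The timestamps on the matching metastore lines should show when Hive aborted each transaction relative to the commit attempt that NiFi logged.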


Community Manager

@Althotta, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as this will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager

