PutHiveStreaming + Failed to create HiveWriter Error


I'm trying to write to Hive using NiFi's PutHiveStreaming processor. The connection appears to succeed, but the processor fails when it tries to write the actual values. I've included my NiFi configuration, the NiFi log output, the Hive Metastore log output, and the table DDL below. Any input would be greatly appreciated. This is running on Azure, NiFi is running on the same nodes as HDP, and I have verified that the network configuration is correct.

HDP Version: 2.5.0.0-1237

Hive Metastore Error from Log:

2016-08-25 18:55:57,286 ERROR [pool-5-thread-57]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(195)) - java.lang.IllegalStateException: Unexpected DataOperationType: UNSET agentInfo=Unknown txnid:6701
        at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:938)
        at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:814)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5751)
        at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:139)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:97)
        at com.sun.proxy.$Proxy12.lock(Unknown Source)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:11860)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:11844)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

NiFi Log (master branch build from 2016-08-25):

2016-08-25 19:05:32,396 WARN [Finalizer] o.a.thrift.transport.TIOStreamTransport Error closing output stream.
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) ~[na:1.8.0_66]
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_66]
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.8.0_66]
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[na:1.8.0_66]
        at java.io.FilterOutputStream.close(FilterOutputStream.java:158) ~[na:1.8.0_66]
        at org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.thrift.transport.TSocket.close(TSocket.java:196) [libthrift-0.9.2.jar:0.9.2]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:500) [hive-metastore-1.2.1.jar:1.2.1]
        at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.tearDown(HiveClientCache.java:403) [hive-hcatalog-core-1.2.1.jar:1.2.1]
        at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.finalize(HiveClientCache.java:418) [hive-hcatalog-core-1.2.1.jar:1.2.1]
        at java.lang.System$2.invokeFinalize(System.java:1270) [na:1.8.0_66]
        at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) [na:1.8.0_66]
        at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [na:1.8.0_66]
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) [na:1.8.0_66]
2016-08-25 19:05:32,396 ERROR [Finalizer] hive.metastore Unable to shutdown local metastore client
org.apache.thrift.transport.TTransportException: java.net.SocketException: Socket closed
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65) ~[libthrift-0.9.2.jar:0.9.2]
        at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436) ~[libfb303-0.9.2.jar:na]
        at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) ~[libfb303-0.9.2.jar:na]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:492) ~[hive-metastore-1.2.1.jar:1.2.1]
        at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.tearDown(HiveClientCache.java:403) [hive-hcatalog-core-1.2.1.jar:1.2.1]
        at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.finalize(HiveClientCache.java:418) [hive-hcatalog-core-1.2.1.jar:1.2.1]
        at java.lang.System$2.invokeFinalize(System.java:1270) [na:1.8.0_66]
        at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) [na:1.8.0_66]
        at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [na:1.8.0_66]
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) [na:1.8.0_66]
Caused by: java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) ~[na:1.8.0_66]
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_66]
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.8.0_66]
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[na:1.8.0_66]
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159) ~[libthrift-0.9.2.jar:0.9.2]
        ... 10 common frames omitted

Hive DDL:

CREATE TABLE IF NOT EXISTS XXX.REDACTED(
  RPC STRING,
  REVIEW_TEXT STRING
)
CLUSTERED BY (RPC) INTO 3 BUCKETS
ROW FORMAT DELIMITED
STORED AS ORC 
TBLPROPERTIES('transactional'='true');

[Screenshot attachment: PutHiveStreaming processor configuration (screen-shot-2016-08-25-at-31058-pm.png)]
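
In case the screenshot doesn't come through, the key PutHiveStreaming properties are roughly as follows (the values here are redacted placeholders, not my exact settings):

Hive Metastore URI: thrift://<metastore-host>:9083
Hive Configuration Resources: /etc/hive/conf/hive-site.xml
Database Name: XXX
Table Name: REDACTED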

4 Replies

Re: PutHiveStreaming + Failed to create HiveWriter Error

This might be a Thrift mismatch (either in the Thrift versions themselves or in the Hive protocol over Thrift). You may need to build NiFi with the Hortonworks versions of the Hive/Hadoop JARs, using the -Phortonworks profile and overriding the hadoop.version and hive.version properties.
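
For reference, the build command would look something like this (the HDP version strings below are illustrative and would need to match what's actually published in the Hortonworks repo):

mvn clean install -DskipTests -Phortonworks -Dhadoop.version=2.7.3.2.5.0.0-1237 -Dhive.version=1.2.1000.2.5.0.0-1237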

Re: PutHiveStreaming + Failed to create HiveWriter Error


@Matt Burgess That makes complete sense to me. It looks like the Hortonworks repo referenced by the -Phortonworks Maven profile does not yet have the Apache JARs for Hadoop and Hive that ship with HDP 2.5.0.0-1237, so I'll probably have to wait until HDP 2.5.x goes GA and then try this approach. Have you tried this with the HDP 2.5 Tech Preview Sandbox, by chance?

Re: PutHiveStreaming + Failed to create HiveWriter Error

I'm not sure which versions the HDP 2.5 TP Sandbox is using, but there are "recent" published versions such as Hadoop 2.7.x at http://repo.hortonworks.com/content/repositories/releases/org/apache/hadoop/hadoop-client/ that might get you close before GA.

Re: PutHiveStreaming + Failed to create HiveWriter Error


Hive Streaming uses inserts rather than bulk data loads, so check that inserts are working against the table and that ACID is turned on; a quick test along the lines of the sketch below should confirm both.
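
For example, something like this from Beeline, using the table from your DDL (the SET statements are the standard ACID session settings for Hive 1.2, assuming they aren't already configured in hive-site.xml):

-- Standard ACID prerequisites (usually set cluster-wide in hive-site.xml)
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.enforce.bucketing=true;

-- A plain transactional insert; if this fails, the problem is on the Hive side, not in NiFi
INSERT INTO XXX.REDACTED VALUES ('test-rpc', 'sample review text');
SELECT * FROM XXX.REDACTED;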

Also, can you share a redacted sample of the data so I can test locally?
