org.apache.hive.hcatalog.streaming.StreamingIOFailure while running HiveStreaming outside of edge node.

We are trying to run Hive Streaming from outside the edge node. The NameNodes are in HA. We are able to get the table metadata while opening the connection, but committing the transaction fails because a host is not reachable; the error is "org.apache.hive.hcatalog.streaming.StreamingIOFailure: Unable to flush recordUpdater".

The program works when run from the Hadoop edge node, but it fails with the above exception when run from any other machine.
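
For reference, outside the edge node we build the client configuration roughly as below (a minimal sketch; the resource paths and metastore URI are placeholders for our actual values), using copies of the cluster's client configuration files that contain the HA nameservice definitions:

import org.apache.hadoop.fs.Path
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.conf.HiveConf.ConfVars

// Minimal sketch of the client-side HiveConf, assuming the cluster's
// core-site.xml and hdfs-site.xml (with the HA nameservice and NameNode
// addresses) have been copied to the client machine.
// Paths and the metastore URI below are placeholders.
val conf = new HiveConf()
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"))
conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"))
conf.setVar(ConfVars.METASTOREURIS, "thrift://metastore-host:9083")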

The root cause is shown below:

Caused by: java.io.IOException: DataStreamer Exception:
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:577)
Caused by: java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Net.java:101)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1601)

The logs indicate that the connection to the Hive metastore succeeds, but flushing the recordUpdater fails with the exception above.
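
The UnresolvedAddressException in DFSOutputStream.createSocketForPipeline suggests the client cannot resolve the DataNode hostnames that the NameNode returns for the write pipeline. A quick check we can run from the outside machine (the DataNode hostname below is a placeholder) looks like this:

import java.net.InetAddress

// Placeholder DataNode hostname; replace with an actual hostname from the cluster.
// On the edge node this resolves via cluster DNS / hosts entries; on an outside
// machine it may not, which would match the exception above.
val dataNodeHost = "datanode01.cluster.internal"
try {
  println(s"$dataNodeHost -> ${InetAddress.getByName(dataNodeHost).getHostAddress}")
} catch {
  case e: java.net.UnknownHostException => println(s"Cannot resolve $dataNodeHost: $e")
}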

Code

import scala.collection.JavaConverters._
import org.apache.hadoop.hive.conf.HiveConf.ConfVars
import org.apache.hive.hcatalog.streaming.{HiveEndPoint, StreamingConnection, StrictJsonWriter}

val hiveEP: HiveEndPoint = new HiveEndPoint(conf.getVar(ConfVars.METASTOREURIS), dbName, tableName, partitionVals.asJava)
val conn: StreamingConnection = hiveEP.newConnection(true, conf, "HiveStreamProcessor")
System.err.println("Got new connection")
val jsonWriter: StrictJsonWriter = new StrictJsonWriter(hiveEP, conf, conn)
val start = System.currentTimeMillis()
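
For completeness, the write path where the failure occurs is the transaction-batch sequence that follows the setup above; roughly (the batch size and JSON record are placeholders):

// Sketch of the write/commit sequence; the JSON record and batch size are
// placeholders. commit() is where the transaction data is flushed and where
// we see the "Unable to flush recordUpdater" error.
val txnBatch = conn.fetchTransactionBatch(10, jsonWriter)
txnBatch.beginNextTransaction()
txnBatch.write("""{"id": 1, "name": "test"}""".getBytes("UTF-8"))
txnBatch.commit()
txnBatch.close()
conn.close()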

Is there anything specific we need to do to make this work from outside the edge node?