Broken Pipes in NiFi

Master Guru

This is the latest HDF 2.x.

It sometimes happens in ExecuteStreamCommand. If a flow runs continuously for a few days, there are a few of these errors per day.

ion: Broken pipe: java.io.IOException: Broken pipe
2017-01-17 18:42:46,932 ERROR [Thread-617894] o.a.n.p.standard.ExecuteStreamCommand
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method) ~[na:1.8.0_77]
at java.io.FileOutputStream.write(FileOutputStream.java:326) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) ~[na:1.8.0_77]
at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:36) ~[nifi-utils-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.processors.standard.ExecuteStreamCommand$2.run(ExecuteStreamCommand.java:489) ~[nifi-standard-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]

With PutHiveQL, it usually happens whenever the processor first starts up:

2017-01-17 18:42:37,563 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hive.PutHiveQL PutHiveQL[id=71b732e9-f140-109c-a315-47f0af695760] Failed to update Hive for StandardFlowFileRecord[uuid=d95fadbd-d349-4efe-88da-cb76c3f2aca8,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1484678529444-3613, container=default, section=541], offset=596859, length=425],offset=0,name=1653249247047835.json.orc,size=425] due to java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe; it is possible that retrying the operation will succeed, so routing to retry: java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
2017-01-17 18:42:37,568 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hive.PutHiveQL
java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:305) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:238) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:98) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.nifi.processors.hive.PutHiveQL.onTrigger(PutHiveQL.java:161) ~[nifi-hive-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TTransport.write(TTransport.java:107) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslTransport.writeLength(TSaslTransport.java:391) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:499) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:223) ~[hive-service-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:215) ~[hive-service-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at sun.reflect.GeneratedMethodAccessor1087.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_77]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_77]
at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1363) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at com.sun.proxy.$Proxy211.ExecuteStatement(Unknown Source) ~[na:na]
at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:296) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
... 17 common frames omitted
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.8.0_77]
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) ~[na:1.8.0_77]
at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) ~[na:1.8.0_77]
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
... 31 common frames omitted
1 ACCEPTED SOLUTION

Master Guru

In the HiveConnectionPool you specified for PutHiveQL, did you supply a Validation Query? This query is run on a connection retrieved from the pool to ensure that it is still valid (the error you describe often happens with idle connections in Hive). If the query fails, the connection will be recycled and a new one will be retrieved.

I haven't tried this sample query, but something like "SELECT 1 from myExistingTable" might work. The basic idea is that the query takes negligible time but still exercises the connection.
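To illustrate the idea (not NiFi's or DBCP's actual implementation), here is a minimal Python sketch of a pool that validates a connection before handing it out. `ValidatingPool`, `FakeConnection`, and `validate` are hypothetical names made up for this example; the `validate` function stands in for running the validation query.

```python
from collections import deque


class ValidatingPool:
    """Toy connection pool that checks each idle connection before reuse."""

    def __init__(self, connect, validate):
        self._connect = connect      # hypothetical factory: returns a new connection
        self._validate = validate    # hypothetical check: raises if the connection is dead
        self._idle = deque()

    def borrow(self):
        # Try idle connections first, discarding any that fail validation.
        while self._idle:
            conn = self._idle.popleft()
            try:
                self._validate(conn)  # stands in for running the validation query
                return conn
            except Exception:
                pass                  # stale connection: drop it and try the next one
        return self._connect()        # nothing usable left, so open a fresh connection

    def give_back(self, conn):
        self._idle.append(conn)


class FakeConnection:
    """Stand-in for a JDBC connection; `alive` simulates the server-side state."""

    def __init__(self):
        self.alive = True


def validate(conn):
    if not conn.alive:
        raise BrokenPipeError("Broken pipe")


pool = ValidatingPool(FakeConnection, validate)
first = pool.borrow()
pool.give_back(first)
first.alive = False      # simulate the server dropping an idle connection
second = pool.borrow()   # validation fails on `first`, so a new connection is opened
```

Without the validation step, `borrow()` would hand back the dead connection and the first real statement would fail, which matches the symptom of PutHiveQL failing on its first run after sitting idle.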

Master Guru

I will add a validation query.

Master Guru

I am getting errors on the validation query.

Master Guru

Was the problem alleviated after the bad connections were returned?

Master Guru

After a restart it was OK.

Master Guru

This also happens in ExecuteStreamCommand when calling a shell script.

Contributor

I am having this same issue on ExecuteStreamCommand. Do you have a recommendation on how to fix it?

Contributor

Not sure if this is considered a fix, but in my situation I think NiFi was accessing files on the script I called too quickly on the Linux box. The data in the output files was not incorrect, so I just turned off errors on the ExecuteStreamCommand processor. It has been running without issue as far as I can tell.
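For anyone else hitting this: the ExecuteStreamCommand stack trace in the question fails while writing to the process's input stream, and one common way to get "Broken pipe" there (which may or may not be what happened in this thread) is a script that exits without reading its stdin. A minimal Python sketch, assuming a POSIX system; the `feed` helper is a hypothetical name for this example:

```python
import subprocess
import sys


def feed(child_code, payload, wait_for_exit_first=False):
    """Write payload to a child's stdin and report whether the pipe broke."""
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdin=subprocess.PIPE,
        bufsize=0,  # unbuffered, so the write below hits the pipe directly
    )
    if wait_for_exit_first:
        proc.wait()  # child is gone, so nothing holds the read end of the pipe
    try:
        proc.stdin.write(payload)
        result = "ok"
    except BrokenPipeError:
        result = "broken pipe"
    finally:
        proc.stdin.close()
    proc.wait()
    return result


# A child that never reads stdin and has already exited.
ignoring = feed("pass", b"data\n", wait_for_exit_first=True)

# A child that consumes everything it is given.
reading = feed("import sys; sys.stdin.buffer.read()", b"data\n")
```

If that is the cause, the output of the script can still be correct even though the error is logged, since the script never needed the input in the first place.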

Contributor

@Timothy Spann @Matt Burgess

Thanks for this very helpful query.

In our case, the validation query seems to have worked, and the flow now functions properly.

For good measure, we also put a retry queue on the PutHiveQL processor.
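In NiFi the retry is wired up in the flow rather than written as code, but the idea behind it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_with_retry` and `flaky_insert` are hypothetical names, and the attempt count and delay are arbitrary.

```python
import time


def run_with_retry(operation, attempts=3, delay_seconds=0.0):
    """Call operation(), retrying when it fails with a connection-level error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError as err:   # BrokenPipeError is a subclass of ConnectionError
            last_error = err
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error


attempts_made = []


def flaky_insert():
    # Fails once with a broken pipe, then succeeds, like a stale connection being replaced.
    attempts_made.append(1)
    if len(attempts_made) == 1:
        raise BrokenPipeError("Broken pipe")
    return "inserted"


result = run_with_retry(flaky_insert)
```

The retry only helps with transient failures such as a stale connection; an error that repeats on every attempt is still raised after the last try.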