Created 11-06-2017 02:58 PM
Very frequently, when I try to run a Spark application, I hit this kind of error:
17/11/06 13:58:57 WARN DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1246657973-10.60.213.61-1495788390217:blk_1076301910_2561450
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2464)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:843)
17/11/06 13:58:57 WARN DFSClient: Error Recovery for block BP-1246657973-10.60.213.61-1495788390217:blk_1076301910_2561450 in pipeline DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-dc399cb9-1705-4471-aad7-db328b1a4d94,DISK], DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK]: bad datanode DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-dc399cb9-1705-4471-aad7-db328b1a4d94,DISK]
17/11/06 13:59:01 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.io.IOException: All datanodes DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1227)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:999)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:506)
17/11/06 13:59:11 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.io.IOException: All datanodes DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1227)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:999)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:506)
Can someone please explain why this is happening? I can't find anything helpful: all nodes look fine on the dashboard, none of the datanodes reports any problem, and neither does *hdfs fsck*. Any ideas? I'm really struggling 😕
Created 01-13-2018 12:14 AM
I have the same error while running Spark Streaming with Flume using Python:
ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.io.IOException: All datanodes DatanodeInfoWithStorage[172.17.0.2:50010,DS-8a8dcd2e-a60e-44af-ae75-4679b6341e4a,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1227)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:999)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:506)
Created 09-04-2018 06:12 PM
You seem to be hitting the open file handle limit of your user. This is a pretty common issue and can usually be cleared by increasing the ulimit values (the default is often 1024, which is easily exhausted by multi-output jobs like yours).
You can follow this short guide to increase it: http://blog.cloudera.com/blog/2009/03/configuration-parameters-what-can-you-just-ignore/ [The section "File descriptor limits"]
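If you want to confirm what limit the driver process actually sees before changing anything, here is a minimal sketch using Python's standard resource module (Linux only; this only checks and raises the per-process soft limit, it is not anything Spark-specific):

import resource

# Query the soft/hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open file descriptors: soft=%d hard=%d" % (soft, hard))

# Raise the soft limit up to the hard limit, for this process only.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

For a permanent, system-wide change you still need to raise the limits at the OS level (e.g. limits.conf), as described in the guide linked above.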