
All datanodes are bad. Aborting...). Closing file

Hi,

I am getting an error stating "All datanodes are bad. Aborting...). Closing file":

2014-09-02 18:03:04,276 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: All datanodes 172.16.0.106:50010 are bad. Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3179)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2672)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2876)
2014-09-02 18:03:09,277 WARN hdfs.BucketWriter: Caught IOException writing to HDFSWriter (All datanodes 172.16.0.106:50010 are bad. Aborting...). Closing file (hdfs://rta01.prod.hs18.lan:9000/logs/prod/jboss/2014/09/01//web07.prod.hs18.lan.jboss2.1409556349351.tmp) and rethrowing exception.
2014-09-02 18:03:09,277 WARN hdfs.BucketWriter: Caught IOException while closing file (hdfs://rta01.prod.hs18.lan:9000/logs/prod/jboss/2014/09/01//web07.prod.hs18.lan.jboss2.1409556349351.tmp). Exception follows.
java.io.IOException: DFSOutputStream is closed
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3754)
at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:345)
at org.apache.flume.sink.hdfs.BucketWriter.access$500(BucketWriter.java:53)
at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:310)
at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:308)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:308)
at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:257)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:382)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
4 REPLIES

What steps did you take prior to this error occurring? Does it happen on any access to HDFS, or only for specific operations?
Regards,
Gautam Gopalakrishnan

I am facing this problem when the flume-ng agent writes data from the web server to the HDFS sink.
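
For reference, a minimal flume.conf for this kind of web-server-to-HDFS pipeline might look like the sketch below. The agent, source, channel, and sink names (agent1, webSource, memChannel, hdfsSink) and the source type are assumptions for illustration only; the HDFS path is taken from the log above.

# Hypothetical agent layout -- adjust names and source type to your setup
agent1.sources = webSource
agent1.channels = memChannel
agent1.sinks = hdfsSink

# Source tailing the web-server log (exec/tail is only one possible choice)
agent1.sources.webSource.type = exec
agent1.sources.webSource.command = tail -F /var/log/jboss/server.log
agent1.sources.webSource.channels = memChannel

agent1.channels.memChannel.type = memory

# HDFS sink; the path mirrors the one shown in the error log
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = hdfs://rta01.prod.hs18.lan:9000/logs/prod/jboss/%Y/%m/%d
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.channel = memChannel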

Explorer

Were you able to solve this issue? If yes, how?

 

Thanks

RK

Rising Star

In the OP's case, it looks like the file has a replication factor of 1. With only a single replica, if that DataNode goes down (a crash or a normal restart), the write pipeline has no other node to fall back on and you get exactly this error. If that's your setup, please consider raising the replication factor to 2 or more.
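
To confirm and change the replication factor, something along these lines should work. The file path is copied from the log above, the replication value of 2 is just an example, and this assumes a standard Hadoop client whose fs -stat supports the %r format.

# Check the current replication factor of the file Flume was writing
hadoop fs -stat %r /logs/prod/jboss/2014/09/01/web07.prod.hs18.lan.jboss2.1409556349351.tmp

# Raise replication on existing files under the log directory (-w waits for completion)
hadoop fs -setrep -R -w 2 /logs/prod/jboss

# For files written from now on, set the default on the client/Flume host in hdfs-site.xml:
#   <property>
#     <name>dfs.replication</name>
#     <value>2</value>
#   </property>

Note that dfs.replication is a client-side setting, so it needs to be visible to the host running the Flume agent, not just the NameNode.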