
java.io.IOException: Premature EOF from inputStream MissingBlock

New Contributor

Hello,

I have a problem that occurs randomly, almost every day and with many different jobs: they get killed after running for a long time.

In the map task logs it is very common to see this:

 

java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.21.0.47:28703 remote=bda2node09.sii.cl/172.21.0.27:50010]
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.readChannelFully(PacketReceiver.java:258)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:209)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:207)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:156)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:788)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:844)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:904)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:954)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
	at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
	at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:186)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2019-07-04 02:18:43,776 WARN [main] org.apache.hadoop.hdfs.DFSClient: Could not obtain block: BP-1064157840-172.21.0.1-1532459013851:blk_1081445547_7704929 file=/user/iecv/input/resumen_detalle_otro_impuesto/iecv_detalle_otro_impuesto_dtoi/IECV_DETALLE_OTRO_IMPUESTO_DTOI_2015-07-02_03-38.txt No live nodes contain current block Block locations: DatanodeInfoWithStorage[172.21.0.6:50010,DS-ecdc373e-e4e5-4300-bd4e-049d09bf06e1,DISK] DatanodeInfoWithStorage[172.21.0.49:50010,DS-dcc8471c-1369-4aa0-8da0-31b20e36f0ca,DISK] DatanodeInfoWithStorage[172.21.0.27:50010,DS-4a71be65-7863-412e-a76d-b7b989832b67,DISK] Dead nodes:  DatanodeInfoWithStorage[172.21.0.6:50010,DS-ecdc373e-e4e5-4300-bd4e-049d09bf06e1,DISK] DatanodeInfoWithStorage[172.21.0.49:50010,DS-dcc8471c-1369-4aa0-8da0-31b20e36f0ca,DISK] DatanodeInfoWithStorage[172.21.0.27:50010,DS-4a71be65-7863-412e-a76d-b7b989832b67,DISK]. Throwing a BlockMissingException
2019-07-04 02:18:43,776 WARN [main] org.apache.hadoop.hdfs.DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1064157840-172.21.0.1-1532459013851:blk_1081445547_7704929 file=/user/iecv/input/resumen_detalle_otro_impuesto/iecv_detalle_otro_impuesto_dtoi/IECV_DETALLE_OTRO_IMPUESTO_DTOI_2015-07-02_03-38.txt
	at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1040)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1023)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1002)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
	at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1668)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:871)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:904)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:954)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
	at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
	at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:186)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
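The "No live nodes contain current block" warning above lists the three replica locations and then reports all of them as dead for this read. As a first check, the affected DataNode addresses can be pulled out of the client log mechanically and then probed one by one; a minimal sketch (the sample line below is abbreviated from the DFSClient warning above):

```shell
# Sketch: extract the DataNode addresses the client reported as dead.
# The sample line is copied (abbreviated) from the DFSClient warning above.
log_line='Dead nodes:  DatanodeInfoWithStorage[172.21.0.6:50010,DS-ecdc373e-e4e5-4300-bd4e-049d09bf06e1,DISK] DatanodeInfoWithStorage[172.21.0.49:50010,DS-dcc8471c-1369-4aa0-8da0-31b20e36f0ca,DISK] DatanodeInfoWithStorage[172.21.0.27:50010,DS-4a71be65-7863-412e-a76d-b7b989832b67,DISK]'

# Pull out every unique host:port pair on the DataNode port (50010)
echo "$log_line" | grep -oE '[0-9]+(\.[0-9]+){3}:50010' | sort -u
# lists the three replica addresses reported in the warning
```

Each address can then be probed from the client host (e.g. with `nc -z -w5 <ip> 50010`), and the block itself checked with `hdfs fsck <path> -files -blocks -locations`, to see whether the nodes are really down or only timing out under load.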

 

 And in the reducer logs we can see this WARN:

 

2019-07-04 00:10:00,848 WARN [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to shuffle for fetcher#4
java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
	at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:562)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:348)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)
2019-07-04 00:10:00,849 WARN [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to shuffle output of attempt_1560882595803_22947_m_013254_0 from bda1node05.sii.cl:13562
java.io.IOException: java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:566)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:348)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)
Caused by: java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
	at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:562)
	... 2 more
2019-07-04 00:10:00,849 WARN [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.Fetcher: copyMapOutput failed for tasks [attempt_1560882595803_22947_m_013254_0]
.....
2019-07-04 02:21:05,554 WARN [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#3 failed to read map headernull decomp: -1, -1
java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3375)
	at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3368)
	at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3356)
	at java.io.DataInputStream.readByte(DataInputStream.java:265)
	at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
	at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
	at org.apache.hadoop.io.WritableUtils.readStringSafely(WritableUtils.java:475)
	at org.apache.hadoop.mapreduce.task.reduce.ShuffleHeader.readFields(ShuffleHeader.java:66)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:509)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:348)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)
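Both traces above are timeout-driven: the map side hits the 60-second DFS client read timeout, and the reduce side hits the shuffle fetcher's read timeout. If the root cause turns out to be slow or overloaded (rather than dead) DataNodes, one common mitigation is to raise the client-side timeouts while the underlying problem is investigated; a sketch of the relevant properties (values are illustrative, not recommendations):

```xml
<!-- hdfs-site.xml: illustrative values only; raising timeouts masks a
     slow or flaky DataNode, it does not fix it -->
<property>
  <!-- DFS client read timeout; the 60000 ms in the trace above is the default -->
  <name>dfs.client.socket-timeout</name>
  <value>120000</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
```

On the MapReduce side, `mapreduce.reduce.shuffle.read.timeout` in mapred-site.xml plays the analogous role for the fetcher shown in the reduce trace.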

 

 

Any ideas on the possible causes of this kind of issue?

 

Thanks,

CF


Super Guru
Hi @Cristian.fuentesd,

50010 is the DataNode port:
remote=xxxxxxx.xxx.xx/xxx.xx.x.xxx:50010

Have you checked the DataNode log on host xxxxxxx.xxx.xx? (You probably want to mask your host names).
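Masking can be done mechanically before posting a log excerpt; a small sketch (the `sed` patterns are illustrative and match the addresses in this thread):

```shell
# Sketch: mask IP addresses and host names in a log line before posting.
# The sample input is taken from the SocketTimeoutException above;
# the patterns are illustrative, adjust them to your own naming scheme.
masked=$(echo 'remote=bda2node09.sii.cl/172.21.0.27:50010' \
  | sed -E -e 's/[0-9]+(\.[0-9]+){3}/xxx.xx.x.xx/g' \
           -e 's/[a-z0-9]+\.sii\.cl/xxxxxxx.xxx.xx/g')
echo "$masked"   # remote=xxxxxxx.xxx.xx/xxx.xx.x.xx:50010
```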

Cheers
Eric