Created 03-13-2017 02:10 PM
Anyone have further info that can be provided regarding this error?
2017-03-10 15:18:44,317 INFO datanode.DataNode (BlockReceiver.java:receiveBlock(935)) - Exception for BP-282268147-124.121.209.38-1430171074465:blk_1117945085_44312886
java.io.IOException: Incorrect value for packet payload size: 2147483128
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:502)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:896)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:805)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
        at java.lang.Thread.run(Thread.java:745)
This comes from PacketReceiver.java on the HDFS Data Node. I think the value of MAX_PACKET_SIZE is hard-coded to 16M in that code, but somehow I have a client operation which is resulting in a payload size of a hair under 2GB. Not sure where to look for settings that would control this behavior.
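For context, here is a minimal sketch of the kind of sanity check that produces this error. This is not the actual HDFS source; the class name, method name, and exact bounds check are paraphrased from the log message above, and the 16 MB constant is the value discussed in this thread:

```java
import java.io.IOException;

// Sketch (paraphrased, not the real PacketReceiver) of the payload-size
// sanity check that throws "Incorrect value for packet payload size".
public class PacketSizeCheck {
    // Hard-coded cap, matching the 16 MB MAX_PACKET_SIZE discussed above.
    static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;

    static void checkPayloadSize(int payloadLen) throws IOException {
        // A negative or oversized length means the packet header is bogus.
        if (payloadLen < 0 || payloadLen > MAX_PACKET_SIZE) {
            throw new IOException(
                "Incorrect value for packet payload size: " + payloadLen);
        }
    }

    public static void main(String[] args) {
        try {
            // The value from the log above: a hair under 2 GB, far over the cap.
            checkPayloadSize(2147483128);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

So the DataNode is rejecting a packet whose advertised payload is ~2 GB, two orders of magnitude past the hard-coded cap, which points at a corrupted or overflowed length on the client side rather than a tunable setting.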
The client gets a connection reset by peer:
2017-03-10 15:18:44.317 -0600 WARN DFSOutputStream$DataStreamer$ResponseProcessor - DFSOutputStream ResponseProcessor exception for block BP-282268147-124.121.209.38-1430171074465:blk_1117945085_44312886
java.io.IOException: Connection reset by peer
2017-03-10 15:18:45.020 -0600 WARN DFSOutputStream$DataStreamer - Error Recovery for block BP-282268147-164.121.209.38-1430171074465:blk_1117945085_44312886 in pipeline DatanodeInfoWithStorage[164.121.209.43:50010,DS-8de5e011-72e8-4097-bbf9-5467b1542f22,DISK], DatanodeInfoWithStorage[164.121.209.30:50010,DS-4c9dd8a3-07ee-4e45-bef1-73f6957c1383,DISK]: bad datanode DatanodeInfoWithStorage[164.121.209.43:50010,DS-8de5e011-72e8-4097-bbf9-5467b1542f22,DISK]
Created 03-13-2017 06:34 PM
Can you check what "io.file.buffer.size" is set to here? You may need to tweak it so that it stays below what "MAX_PACKET_SIZE" is set to.
Referencing a great blog post here (http://johnjianfang.blogspot.com/2014/10/hadoop-two-file-buffer-size.html)
For example, take a look at the BlockSender in HDFS:

class BlockSender implements java.io.Closeable {
  /**
   * Minimum buffer used while sending data to clients. Used only if
   * transferTo() is enabled. 64KB is not that large. It could be larger, but
   * not sure if there will be much more improvement.
   */
  private static final int MIN_BUFFER_WITH_TRANSFERTO = 64*1024;
  private static final int TRANSFERTO_BUFFER_SIZE = Math.max(
      HdfsConstants.IO_FILE_BUFFER_SIZE, MIN_BUFFER_WITH_TRANSFERTO);
}

The BlockSender uses "io.file.buffer.size" as the transfer buffer size. If this parameter is not defined, the default buffer size of 64KB is used. This explains why most Hadoop IOs were either 4K or 64K chunks in my friend's cluster, since he had not tuned it. To achieve better performance, we should tune "io.file.buffer.size" to a much bigger value, for example up to 16MB. The upper limit is set by MAX_PACKET_SIZE in org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.
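For reference, "io.file.buffer.size" is a core-site.xml property, set in bytes. A sketch of a 128 KB setting (any value chosen here should stay well below the 16 MB MAX_PACKET_SIZE discussed above) would look like:

```xml
<!-- core-site.xml fragment (example value, not a recommendation) -->
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value> <!-- 128 KB, in bytes -->
</property>
```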
Created 03-13-2017 06:49 PM
I have io.file.buffer.size set to 128K. MAX_PACKET_SIZE, I believe, is 16M.
Created 04-19-2017 07:12 PM
It is likely another instance of HDFS-11608, where the block size is set too large (> 2GB). The overflow issue was recently fixed by https://issues.apache.org/jira/browse/HDFS-11608.
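To illustrate the failure mode (with hypothetical numbers, not the exact arithmetic from the patched code): with a block size over 2 GB, any intermediate length that gets narrowed from long to int wraps around, which is how a write can end up advertising a payload size near Integer.MAX_VALUE like the 2147483128 in the log:

```java
// Hedged illustration of the 32-bit overflow class of bug fixed by
// HDFS-11608: a block size > 2 GB no longer fits in an int.
public class BlockSizeOverflow {
    public static void main(String[] args) {
        long blockSize = 4L * 1024 * 1024 * 1024; // 4 GB block size (> 2 GB)
        long bytesWritten = 520;                  // hypothetical write offset

        // Buggy pattern: narrowing a long that exceeds Integer.MAX_VALUE.
        int remaining = (int) (blockSize - bytesWritten);
        System.out.println("remaining as int: " + remaining); // prints -520 (wrapped)

        // Safe pattern: keep the arithmetic in long, clamp before casting.
        long remainingLong = blockSize - bytesWritten;
        int safe = (int) Math.min(remainingLong, Integer.MAX_VALUE);
        System.out.println("clamped: " + safe); // prints 2147483647
    }
}
```

That is why the fix (or simply keeping dfs.blocksize at or below 2 GB) makes the "Incorrect value for packet payload size" error go away.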