Datanode socket timeout setting
Labels: Apache Hadoop, HDFS
Created on 05-10-2016 02:41 PM - edited 09-16-2022 03:18 AM
Hi. I'm new to Hadoop and Cloudera. I have a 5-node cluster with 3 data nodes. I have a third-party client program that opens HDFS files and writes data to them as it arrives in a stream. On a timer, every 10 minutes, the client closes the files and opens new ones for writing (a minimal sketch of this pattern follows the log below). Before the close can happen, the datanode socket connection times out with this error:
~~~
2016-05-10 14:17:20,165 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1298278955-172.31.1.79-1461125109305:blk_1073807048_66356
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.31.15.196:50010 remote=/172.31.1.81:57017]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:500)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:894)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:794)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
~~~
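For reference, the client's write pattern is roughly the following. This is a simplified sketch of what the third-party program does, not its actual code; the file path, record source, and roll logic are stand-ins:

~~~
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RollingHdfsWriter {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        final long rollIntervalMs = 10 * 60 * 1000L; // new file every 10 minutes

        while (true) {
            // Open a fresh file for this 10-minute window (path is a stand-in).
            Path path = new Path("/data/stream-" + System.currentTimeMillis());
            FSDataOutputStream out = fs.create(path);
            long deadline = System.currentTimeMillis() + rollIntervalMs;
            while (System.currentTimeMillis() < deadline) {
                byte[] record = nextRecord(); // stand-in for the incoming stream
                if (record != null) {
                    out.write(record);
                    out.hflush(); // push the data through the datanode pipeline
                }
            }
            // The SocketTimeoutException above fires before this close completes.
            out.close();
        }
    }

    // Placeholder: the real program reads records from its data source,
    // returning null when nothing has arrived yet.
    private static byte[] nextRecord() {
        return null;
    }
}
~~~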
Question: How do I change the 60000 millis setting to a larger value?
I've tried setting dfs.datanode.socket.write.timeout and dfs.socket.timeout in the HDFS configuration through the Cloudera admin console, with a config redeploy and cluster restart. I've also tried adding these, plus dfs.client.socket-timeout, in hdfs-client.xml on the client side. Nothing seems to affect the value that is actually used.
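For concreteness, a client-side override of these properties would look roughly like the sketch below. The property names are the standard HDFS keys; the 300000 ms value is just an example, and the check uses Configuration#getPropertySources to show which file the effective value was actually loaded from:

~~~
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class TimeoutCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // What value did the client actually load, and from which resource?
        System.out.println("dfs.client.socket-timeout = "
                + conf.get("dfs.client.socket-timeout"));
        String[] sources = conf.getPropertySources("dfs.client.socket-timeout");
        System.out.println("loaded from: "
                + (sources == null ? "not set in any file" : String.join(", ", sources)));

        // Programmatic override, in milliseconds; the default 60000 matches
        // the "60000 millis timeout" in the datanode log above.
        conf.setInt("dfs.client.socket-timeout", 300000);
        conf.setInt("dfs.datanode.socket.write.timeout", 300000);

        FileSystem fs = FileSystem.get(conf);
        fs.close();
    }
}
~~~

If the printed source isn't the file that was edited, the client is loading its configuration from somewhere else, which would explain overrides having no effect.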
Thanks in advance.
-Bruce
Created 07-25-2017 11:32 PM
Hi,
Did you get any resolution for this? I'm facing the same problem now.
Thanks
Created 11-08-2018 02:06 PM
Please post your solution; I'm facing the same issue.
Created 11-08-2018 02:08 PM
I am facing the same issue
Created 11-22-2018 08:51 PM
We can't be sure of the reason for this message from just the snippet you've provided. Notice that the connection is set up successfully, but there is no response from the DataNode:
~~~
java.nio.channels.SocketChannel[connected local=/172.31.15.196:50010 remote=/172.31.1.81:57017]
~~~
This can happen for various reasons: the write pipeline was interrupted, network congestion is at play, the DataNode's disk is performing poorly, the DataNode host OS is having issues such as kernel soft lockups, or the DataNode is simply too heavily loaded to respond. You'll have to dig deeper into the logs for more information. Start with the messages logged just before this exception in the DataNode log; one way to do that is sketched below.
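For example, a quick way to pull every DataNode log line that mentions the failing block. The block ID comes from the exception in the question; the log path is an assumption based on a typical CDH layout and will differ per install:

~~~
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class DnLogScan {
    public static void main(String[] args) throws Exception {
        // Assumed DataNode log location; adjust to your installation.
        Path log = Paths.get("/var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE.log.out");
        // Block ID taken from the SocketTimeoutException in the question.
        String blockId = "blk_1073807048";
        try (Stream<String> lines = Files.lines(log)) {
            lines.filter(l -> l.contains(blockId)).forEach(System.out::println);
        }
    }
}
~~~

The lines around those matches usually show what the pipeline was doing with the block before the timeout fired.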
