<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Getting "IOException: Failed to replace a bad datanode" while executing MapReduce Jobs in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166447#M24995</link>
    <description>&lt;P&gt;I'm trying to execute a MapReduce streaming job on a 10-node Hadoop cluster (HDP 2.2). There are 5 DataNodes in the cluster. When the reduce phase reaches almost 100% completion, I get the error below in the client logs:&lt;/P&gt;&lt;PRE&gt;Error: java.io.IOException: Failed to replace a bad
datanode on the existing pipeline due to no more good datanodes being available
to try. (Nodes: current=[x.x.x.x:50010], original=[x.x.x.x:50010]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration&lt;/PRE&gt;&lt;P&gt;The DataNode on which the jobs were executing contained the logs below:&lt;/P&gt;&lt;PRE&gt; INFO datanode.DataNode (BlockReceiver.java:run(1222)) - PacketResponder:
BP-203711345-10.254.65.246-1444744156994:blk_1077645089_3914844,
type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available              
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)

java.io.IOException: Premature EOF from inputStream              
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)

2016-04-10 08:12:14,477 WARN  datanode.DataNode
(BlockReceiver.java:run(1256)) - IOException in BlockReceiver.run(): 

java.io.IOException: Connection reset by peer

2016-04-10 08:13:22,431 INFO  datanode.DataNode
(BlockReceiver.java:receiveBlock(816)) - Exception for
BP-203711345-x.x.x.x -1444744156994:blk_1077645082_3914836

java.net.SocketTimeoutException: 60000 millis timeout while
waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected
local=/XX.XXX.XX.XX:50010 remote=/XX.XXX.XX.XXX:57649]

&lt;/PRE&gt;&lt;P&gt;The NameNode logs contained the below warning:&lt;/P&gt;&lt;PRE&gt; WARN blockmanagement.BlockPlacementPolicy
(BlockPlacementPolicyDefault.java:chooseTarget(383)) - Failed to place enough
replicas, still in need of 1 to reach 2 (unavailableStorages=[DISK],
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more
information, please enable DEBUG log level on
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy&lt;/PRE&gt;&lt;P&gt;I have tried setting the parameters below in hdfs-site.xml:&lt;/P&gt;&lt;PRE&gt;dfs.datanode.handler.count = 10
dfs.client.file-block-storage-locations.num-threads = 10
dfs.datanode.socket.write.timeout = 20000
&lt;/PRE&gt;&lt;P&gt;But the error still persists. Kindly suggest a solution.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Tue, 12 Apr 2016 19:48:57 GMT</pubDate>
    <dc:creator>phoncy_joseph</dc:creator>
    <dc:date>2016-04-12T19:48:57Z</dc:date>
    <item>
      <title>Getting "IOException: Failed to replace a bad datanode" while executing MapReduce Jobs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166447#M24995</link>
      <description>&lt;P&gt;I'm trying to execute a MapReduce streaming job on a 10-node Hadoop cluster (HDP 2.2). There are 5 DataNodes in the cluster. When the reduce phase reaches almost 100% completion, I get the error below in the client logs:&lt;/P&gt;&lt;PRE&gt;Error: java.io.IOException: Failed to replace a bad
datanode on the existing pipeline due to no more good datanodes being available
to try. (Nodes: current=[x.x.x.x:50010], original=[x.x.x.x:50010]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration&lt;/PRE&gt;&lt;P&gt;The DataNode on which the jobs were executing contained the logs below:&lt;/P&gt;&lt;PRE&gt; INFO datanode.DataNode (BlockReceiver.java:run(1222)) - PacketResponder:
BP-203711345-10.254.65.246-1444744156994:blk_1077645089_3914844,
type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available              
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)

java.io.IOException: Premature EOF from inputStream              
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)

2016-04-10 08:12:14,477 WARN  datanode.DataNode
(BlockReceiver.java:run(1256)) - IOException in BlockReceiver.run(): 

java.io.IOException: Connection reset by peer

2016-04-10 08:13:22,431 INFO  datanode.DataNode
(BlockReceiver.java:receiveBlock(816)) - Exception for
BP-203711345-x.x.x.x -1444744156994:blk_1077645082_3914836

java.net.SocketTimeoutException: 60000 millis timeout while
waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected
local=/XX.XXX.XX.XX:50010 remote=/XX.XXX.XX.XXX:57649]

&lt;/PRE&gt;&lt;P&gt;The NameNode logs contained the below warning:&lt;/P&gt;&lt;PRE&gt; WARN blockmanagement.BlockPlacementPolicy
(BlockPlacementPolicyDefault.java:chooseTarget(383)) - Failed to place enough
replicas, still in need of 1 to reach 2 (unavailableStorages=[DISK],
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more
information, please enable DEBUG log level on
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy&lt;/PRE&gt;&lt;P&gt;I have tried setting the parameters below in hdfs-site.xml:&lt;/P&gt;&lt;PRE&gt;dfs.datanode.handler.count = 10
dfs.client.file-block-storage-locations.num-threads = 10
dfs.datanode.socket.write.timeout = 20000
&lt;/PRE&gt;&lt;P&gt;But the error still persists. Kindly suggest a solution.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 12 Apr 2016 19:48:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166447#M24995</guid>
      <dc:creator>phoncy_joseph</dc:creator>
      <dc:date>2016-04-12T19:48:57Z</dc:date>
    </item>
    <item>
      <title>Re: Getting "IOException: Failed to replace a bad datanode" while executing MapReduce Jobs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166448#M24996</link>
      <description>&lt;P&gt;Are all of your DataNodes healthy, and do they have enough available disk space? For some reason, writing a block to one of them fails, and because your replication factor is 2 and replace-datanode-on-failure.policy=DEFAULT, the NameNode will not try another DataNode and the write fails. First, make sure your DataNodes are all right. If they look good, then try setting&lt;/P&gt;&lt;PRE&gt;dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
dfs.client.block.write.replace-datanode-on-failure.best-effort=true&lt;/PRE&gt;&lt;P&gt;The second setting works only in newer versions of Hadoop (HDP 2.2.6 or later). See &lt;A href="https://community.hortonworks.com/articles/16144/write-or-append-failures-in-very-small-clusters-un.html"&gt;this&lt;/A&gt; and &lt;A href="http://blog.cloudera.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/"&gt;this&lt;/A&gt; for details.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Apr 2016 20:48:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166448#M24996</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-04-12T20:48:04Z</dc:date>
    </item>
    <item>
      <title>Re: Getting "IOException: Failed to replace a bad datanode" while executing MapReduce Jobs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166449#M24997</link>
      <description>&lt;P&gt;Thanks for the suggestions. Two of the DataNodes in the cluster had to be replaced, as they didn't have enough disk space. I have also set the property below in the HDFS configuration, and the jobs started executing fine, even though I still noticed the "Premature EOF" error in the DataNode logs.&lt;/P&gt;&lt;PRE&gt;dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
&lt;/PRE&gt;</description>
      <pubDate>Thu, 14 Apr 2016 17:17:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Getting-quot-IOException-Failed-to-replace-a-bad-datanode/m-p/166449#M24997</guid>
      <dc:creator>phoncy_joseph</dc:creator>
      <dc:date>2016-04-14T17:17:37Z</dc:date>
    </item>
  </channel>
</rss>