<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question DATANODE + Failed to replace a bad datanode on the existing pipeline in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/DATANODE-Failed-to-replace-a-bad-datanode-on-the-existing/m-p/184142#M146289</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;We have an Ambari cluster with 4 datanode (worker) machines,
and each worker machine has one disk of 1 TB.&lt;/P&gt;&lt;P&gt;Before describing the problem, I want to make clear that we verified
the following points and found no issue with any of them:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;1. The cluster is working without network problems.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;2. We checked DNS, and hostname resolution is correct.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;3. The HDFS Java heap size was increased to 8 GB (so no problem with
the Java heap size).&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;4. We ran the HDFS service check, and there is no issue there.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;5. To resolve this issue, we set the following two properties
from Ambari &amp;gt; HDFS &amp;gt; Configs &amp;gt; Custom HDFS site &amp;gt; Add Property:&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;dfs.client.block.write.replace-datanode-on-failure.enable=NEVER&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;dfs.client.block.write.replace-datanode-on-failure.policy=NEVER&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;But we still have the problem.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Now, the problem itself: on one of the worker machines we see the following:&lt;/P&gt;&lt;PRE&gt; tail -f /grid/sdb/hadoop/yarn/log/application_1523836627832749_4432/container_e23_1592736529519_4432_01_000041/stderr


---2018-07-12T20:51:28.028 ERROR [driver][][] [org.apache.spark.scheduler.LiveListenerBus] Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[45.23.2.56:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[45.23.2.56:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]], original=[DatanodeInfoWithStorage[45.23.2.56:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[45.23.2.56:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1059)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1122)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1280)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1005)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:512)
&lt;/PRE&gt;&lt;P&gt;We can see the error: &lt;STRONG&gt;java.io.IOException: Failed to replace a bad datanode
on the existing pipeline due to no more good datanodes being available&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;What
else can we do in order to resolve the error "&lt;STRONG&gt;Failed to replace a bad
datanode on the existing pipeline due to no more good datanodes being
available&lt;/STRONG&gt;"?&lt;/P&gt;</description>
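One detail worth double-checking in the configuration described above: according to Hadoop's hdfs-default.xml, the `.enable` property is a boolean (true/false), while NEVER/DEFAULT/ALWAYS are values of the `.policy` property only, so `enable=NEVER` is not a recognized value. A minimal hdfs-site.xml sketch of these client-side settings (property names are from hdfs-default.xml; the values shown are illustrative choices, not the cluster's live configuration):

```xml
<!-- Client-side replace-datanode-on-failure settings (hdfs-default.xml). -->
<property>
  <!-- Boolean switch: true or false (default true). "NEVER" is not a
       legal value here; that value belongs to the .policy property. -->
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <!-- One of DEFAULT, ALWAYS, NEVER; only consulted when enable=true. -->
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
<property>
  <!-- If true, the client continues the write even when a replacement
       datanode cannot be found, at the risk of reduced replication. -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
```

Note that with only 4 datanodes (the setup described above), pipeline recovery has few replacement candidates, which is the situation these properties are meant to address.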
    <pubDate>Fri, 13 Jul 2018 04:13:36 GMT</pubDate>
    <dc:creator>mike_bronson7</dc:creator>
    <dc:date>2018-07-13T04:13:36Z</dc:date>
  </channel>
</rss>

