<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207717#M169678</link>
    <description>&lt;DIV&gt;Regarding DATANODE_PID: how do I find it? (I guess from the worker machine?)&lt;/DIV&gt;</description>
    <pubDate>Wed, 31 Jan 2018 20:22:04 GMT</pubDate>
    <dc:creator>mike_bronson7</dc:creator>
    <dc:date>2018-01-31T20:22:04Z</dc:date>
    <item>
      <title>Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207711#M169672</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Need advice: why do we get the following error?&lt;/STRONG&gt; "Failed to replace a bad datanode on the existing pipeline due to no more good datanodes"&lt;/P&gt;&lt;P&gt;I also saw another question that discusses my problem: https://community.hortonworks.com/questions/27153/getting-ioexception-failed-to-replace-a-bad-datano.html&lt;/P&gt;&lt;P&gt;Log excerpt:&lt;/P&gt;&lt;PRE&gt;java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[34.2.31.31:50010,DS-8234bb39-0fd4-49be-98ba-32080bc24fa9,DISK], DatanodeInfoWithStorage[34.2.31.33:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]], original=[DatanodeInfoWithStorage[34.2.31.31:50010,DS-8234bb39-0fd4-49be-98ba-32080bc24fa9,DISK], DatanodeInfoWithStorage[34.2.31.33:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1036)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1110)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1268)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:993)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:500)
---2018-01-30T15:15:15.015 INFO  [][][] [dal.locations.LocationsDataFramesHandler] &lt;/PRE&gt;</description>
      <pubDate>Wed, 31 Jan 2018 17:07:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207711#M169672</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2018-01-31T17:07:41Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207712#M169673</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@Michael Bronson&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt; The two properties &lt;STRONG&gt;'dfs.client.block.write.replace-datanode-on-failure.policy'&lt;/STRONG&gt; and '&lt;STRONG&gt;dfs.client.block.write.replace-data node-on-failure.enable&lt;/STRONG&gt;' influences the client side behavior for the pipeline recovery and these properties can be added as custom properties in the "hdfs-site" configuration.&lt;BR /&gt;&lt;BR /&gt;Continuous network issues causing or repeated packet drops can lead to such issues. This specially happens when data is being written to any one of the DataNode which is in process of pipelining the data to next  datanode and due to any communicaiton issue it may lead to pipeline failure. It can also happen when HDFS client hangs or observs connection timesout due to some memory contention smaller heap size or ulimits.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;So please check if your DataNodes are healthy and there is no N/W packet drop or communication issue.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 19:52:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207712#M169673</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2018-01-31T19:52:34Z</dc:date>
    </item>
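    <!-- Editor's note: a hedged illustration of how the two client-side properties
    mentioned in the reply above could be set as custom "hdfs-site" properties.
    The property names come from the post; the values are an assumption for
    illustration only. NEVER is typically only advisable on very small clusters.

    ```
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
      <value>NEVER</value>
    </property>
    ```
    -->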
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207713#M169674</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@Michael Bronson&lt;/A&gt; &lt;/P&gt;&lt;P&gt;The following article can help in understanding more about these properties: &lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/articles/16144/write-or-append-failures-in-very-small-clusters-un.html" target="_blank"&gt;https://community.hortonworks.com/articles/16144/write-or-append-failures-in-very-small-clusters-un.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:02:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207713#M169674</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2018-01-31T20:02:44Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207714#M169675</link>
      <description>&lt;P&gt;@Jay please advise: what is the best way to check that the DataNodes are healthy?&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:05:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207714#M169675</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2018-01-31T20:05:23Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207715#M169676</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@Michael Bronson&lt;/A&gt; &lt;/P&gt;&lt;P&gt;May be we can try running some HDFS Service checks form AmbarI UI.&lt;/P&gt;&lt;P&gt;Checking the DataNode logs can also give us some idea like if they are sufferring from Memory limitations or if there are some repeated errors.  We can check the DataNode memory utilization to see if they have enough memory and how much is being used currently.&lt;/P&gt;&lt;PRE&gt;# $JAVA_HOME/bin/jmap -heap $DATANODE_PID&lt;/PRE&gt;&lt;P&gt;- We can also check if the DataNode ports are accessible from other nodes and if there is any communication issue. From one datanode host please check if we can connect to other datanode port.&lt;/P&gt;&lt;PRE&gt;# telnet $DATANODE_HOSTNAME   $DATANODE_PORT&lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:11:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207715#M169676</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2018-01-31T20:11:02Z</dc:date>
    </item>
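    <!-- Editor's note: a minimal bash sketch of the connectivity check suggested
    in the reply above, assuming bash (for its built-in /dev/tcp redirection)
    and coreutils timeout. The hostname and port in the usage example are
    placeholders; 50010 is the default DataNode transfer port on this HDP line.

```shell
#!/usr/bin/env bash
# Probe a TCP port using bash's /dev/tcp redirection with a 2-second timeout.
# Prints "open" if a connection succeeds, "closed" otherwise.
check_port() {
  local host="$1" port="$2"
  if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Usage example (placeholder host; adjust for your cluster):
# check_port datanode02.example.com 50010
```

    One advantage of this sketch over telnet is that it needs nothing installed
    beyond bash and coreutils on the probing host.
    -->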
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207716#M169677</link>
      <description>&lt;PRE&gt;Regarding the output of hdfs dfsadmin -report:

what can we do about the "Missing blocks"? (anything to do regarding that?)

We got:

hdfs dfsadmin -report
Configured Capacity: 8226130288640 (7.48 TB)
Present Capacity: 8225508617182 (7.48 TB)
DFS Remaining: 8205858544606 (7.46 TB)
DFS Used: 19650072576 (18.30 GB)
DFS Used%: 0.24%
Under replicated blocks: 4
Blocks with corrupt replicas: 0
Missing blocks: 4
Missing blocks (with replication factor 1): 0&lt;/PRE&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:17:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207716#M169677</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2018-01-31T20:17:42Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207717#M169678</link>
      <description>&lt;DIV&gt;Regarding DATANODE_PID: how do I find it? (I guess from the worker machine?)&lt;/DIV&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:22:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207717#M169678</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2018-01-31T20:22:04Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207718#M169679</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;@Michael Bronson&lt;BR /&gt;&lt;/A&gt;&lt;/P&gt;&lt;P&gt;dfsadmin report might not be very helpful here.&lt;/P&gt;&lt;P&gt;Regarding the DataNode PID we can do any of the following to find out the PID of DataNode:&lt;/P&gt;&lt;PRE&gt;# ps -ef | grep DataNode
(OR)
# cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid &lt;/PRE&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;Also if you want to list the ports used by the DataNode then you can run the following command:&lt;/P&gt;&lt;PRE&gt;# netstat -tnlpa | grep `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid`&lt;BR /&gt;(OR)
# netstat -tnlpa | grep $DATANODE_PID
&lt;/PRE&gt;&lt;P&gt;.&lt;BR /&gt;&lt;A rel="user" href="https://community.cloudera.com/users/26229/uribarih.html" nodeid="26229"&gt;&lt;/A&gt; &lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:25:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207718#M169679</guid>
      <dc:creator>jsensharma</dc:creator>
      <dc:date>2018-01-31T20:25:34Z</dc:date>
    </item>
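    <!-- Editor's note: the PID-discovery steps in the reply above can be
    sketched as a small shell helper. The pgrep pattern and pid-file path in
    the usage example come from the reply and may differ on other layouts.

```shell
#!/usr/bin/env bash
# Find a process PID by full-command-line pattern, falling back to a pid file.
# Prints the first matching PID, or nothing if no match was found.
find_pid() {
  local pattern="$1" pidfile="$2"
  local pid
  pid=$(pgrep -f "$pattern" | head -n 1)
  if [ -z "$pid" ] && [ -r "$pidfile" ]; then
    pid=$(cat "$pidfile")
  fi
  echo "$pid"
}

# Usage example for the DataNode (run on the worker host itself):
# find_pid "org.apache.hadoop.hdfs.server.datanode.DataNode" \
#          /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid
```

    Matching on the full class name avoids accidentally grabbing the grep or
    NameNode process that a bare "grep DataNode" can pick up.
    -->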
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207719#M169680</link>
      <description>&lt;PRE&gt;@Jay I do not see anything negative in the output, but maybe you want to add your opinion.

/usr/jdk64/jdk1.8.0_112/bin/jmap -heap 26765
Attaching to process ID 26765, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.112-b15
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 1073741824 (1024.0MB)
   NewSize                  = 209715200 (200.0MB)
   MaxNewSize               = 209715200 (200.0MB)
   OldSize                  = 864026624 (824.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 188743680 (180.0MB)
   used     = 13146000 (12.537002563476562MB)
   free     = 175597680 (167.46299743652344MB)
   6.9650014241536455% used
Eden Space:
   capacity = 167772160 (160.0MB)
   used     = 7374968 (7.033317565917969MB)
   free     = 160397192 (152.96668243408203MB)
   4.3958234786987305% used
From Space:
   capacity = 20971520 (20.0MB)
   used     = 5771032 (5.503684997558594MB)
   free     = 15200488 (14.496315002441406MB)
   27.51842498779297% used
To Space:
   capacity = 20971520 (20.0MB)
   used     = 0 (0.0MB)
   free     = 20971520 (20.0MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 864026624 (824.0MB)
   used     = 25506528 (24.324920654296875MB)
   free     = 838520096 (799.6750793457031MB)
   2.952053477463213% used&lt;/PRE&gt;</description>
      <pubDate>Wed, 31 Jan 2018 20:56:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207719#M169680</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2018-01-31T20:56:52Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207720#M169681</link>
      <description>&lt;P&gt;Once the file is corrupted, you cannot recover it, even after setting &lt;STRONG&gt;dfs.client.block.write.replace-datanode-on-failure.policy=NEVER&lt;/STRONG&gt; and restarting HDFS. As a workaround, I created a copy of the file and removed the old one.&lt;/P&gt;</description>
      <pubDate>Thu, 24 Jan 2019 21:47:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Failed-to-replace-a-bad-datanode-on-the-existing-pipeline/m-p/207720#M169681</guid>
      <dc:creator>tanmoy_official</dc:creator>
      <dc:date>2019-01-24T21:47:27Z</dc:date>
    </item>
  </channel>
</rss>

