
DATANODE + Failed to replace a bad datanode on the existing pipeline


Hi all,

We have an Ambari cluster with 4 DataNode machines (workers), and each worker machine has one 1 TB disk.

Before I explain the problem, I want to make clear that we verified the following and saw no problems with any of these points:

1. The cluster is working without network problems.

2. We checked DNS, and hostname resolution is correct.

3. The Java heap size for HDFS was increased to 8 GB (so no problem with Java heap size).

4. We ran the HDFS service check, and there is no issue with it.

5. To resolve this issue, we set the following two properties from Ambari > HDFS > Configs > Custom hdfs-site > Add Property:

dfs.client.block.write.replace-datanode-on-failure.enable=NEVER

dfs.client.block.write.replace-datanode-on-failure.policy=NEVER

But we still have the problem.



NOW, let's talk about the problem:

On one of the worker machines we see the following:

 tail -f /grid/sdb/hadoop/yarn/log/application_1523836627832749_4432/container_e23_1592736529519_4432_01_000041/stderr


---2018-07-12T20:51:28.028 ERROR [driver][][] [org.apache.spark.scheduler.LiveListenerBus] Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[45.23.2.56:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[45.23.2.56:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]], original=[DatanodeInfoWithStorage[45.23.2.56:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[45.23.2.56:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1059)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1122)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1280)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1005)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:512)

We can see the error: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available.

What else can we do in order to resolve the "Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available" error?

Michael-Bronson
13 REPLIES

Master Mentor

@Michael Bronson

Below are the DataNode replacement policies. As stated, the NEVER option is not desirable; to eliminate that error, look at the other options.

Recovery from Pipeline upon Failure

There are four configurable policies regarding whether to add additional DataNodes to replace the bad ones when setting up a pipeline for recovery with the remaining DataNodes:

  1. DISABLE: Disables DataNode replacement and throws an error (at the server); this acts like NEVER at the client.
  2. NEVER: Never replace a DataNode when a pipeline fails (generally not a desirable action).
  3. DEFAULT: Replace based on the following conditions:
    1. Let r be the configured replication factor.
    2. Let n be the number of existing replica DataNodes.
    3. Add a new DataNode only if r >= 3 and EITHER
      • floor(r/2) >= n; OR
      • r > n and the block is hflushed/appended.
    For example, with r = 3 and n = 2 DataNodes remaining in the pipeline, floor(3/2) = 1 < 2, but since r > n a replacement is still required if the block has been hflushed or appended.
  4. ALWAYS: Always add a new DataNode when an existing DataNode failed. This fails if a DataNode can’t be replaced.

To disable using any of these policies, you can set the following configuration property to false (the default is true):

dfs.client.block.write.replace-datanode-on-failure.enable

When enabled, the default policy is DEFAULT. The following config property changes the policy:

dfs.client.block.write.replace-datanode-on-failure.policy

When using DEFAULT or ALWAYS, if only one DataNode succeeds in the pipeline, the recovery will never succeed and the client will not be able to perform the write. This problem is addressed with this configuration property:

dfs.client.block.write.replace-datanode-on-failure.best-effort

which defaults to false. With the default setting, the client will keep trying until the specified policy is satisfied. When this property is set to true, even if the specified policy can’t be satisfied (for example, there is only one DataNode that succeeds in the pipeline, which is less than the policy requirement), the client will still be allowed to continue to write.
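For reference, here is a minimal sketch of how these client-side settings could look when added under Ambari > HDFS > Configs > Custom hdfs-site (the values are illustrative, not a recommendation; note that the .enable property is a boolean, while .policy takes one of the four policy names above):

dfs.client.block.write.replace-datanode-on-failure.enable=true
dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT
dfs.client.block.write.replace-datanode-on-failure.best-effort=true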

HTH


@Geoffrey So, to summarize all the options: as I understand it, the best option in my case is option number 4 (ALWAYS), and I need to set the variable dfs.client.block.write.replace-datanode-on-failure.best-effort to true.

Am I right?

Additionally, I also need to delete the following settings:

dfs.client.block.write.replace-datanode-on-failure.enable=NEVER

dfs.client.block.write.replace-datanode-on-failure.policy=NEVER

Michael-Bronson

Cloudera Employee

@Michael Bronson This scenario is explained very well in the article below:

https://community.hortonworks.com/articles/16144/write-or-append-failures-in-very-small-clusters-un....

Are you seeing any datanode failures at the time of the issue?
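For example, one quick way to check (assuming the hdfs CLI is available on a cluster node) is to list live and dead DataNodes and look for recent failures:

hdfs dfsadmin -report

and then review the DataNode log on the affected worker around the time of the write failure, e.g. /var/log/hadoop/hdfs/hadoop-hdfs-datanode-<hostname>.log (that path is an assumption for a typical HDP layout).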

Master Mentor

@Michael Bronson

The response is YES.

To disable using any of the policies, you can set the following configuration property to false (the default is true):

dfs.client.block.write.replace-datanode-on-failure.enable 

When enabled, the default policy is DEFAULT. The following config property changes the policy:

dfs.client.block.write.replace-datanode-on-failure.policy 

When using DEFAULT or ALWAYS, if only one DataNode succeeds in the pipeline, the recovery will never succeed and the client will not be able to perform the write. This problem is addressed with this configuration property:

dfs.client.block.write.replace-datanode-on-failure.best-effort 

which defaults to false. With the default setting, the client will keep trying until the specified policy is satisfied.

When this property is set to true, even if the specified policy can’t be satisfied (for example, there is only one DataNode that succeeds in the pipeline, which is less than the policy requirement), the client will still be allowed to continue to write.

Please revert!

HTH


@Geoffrey I set the following in Custom hdfs-site, per your recommendation, and we restarted the HDFS service:

dfs.client.block.write.replace-datanode-on-failure.best-effort=true

dfs.client.block.write.replace-datanode-on-failure.enable=false


But we still see the errors in the log (on the first DataNode machine, worker01):


---2018-07-15T08:14:49.049 ERROR [driver][][] [org.apache.spark.scheduler.LiveListenerBus] Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]], original=[DatanodeInfoWithStorage[:50010,DS-f5c5260a-20b1-43f4-b8fd-53e88db2e48e,DISK], DatanodeInfoWithStorage[:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1059)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1122)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1280)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1005)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:512)
Michael-Bronson

Master Mentor

@Michael Bronson

What's the value of the parameter below?

dfs.client.block.write.replace-datanode-on-failure.policy

If it's NEVER, that means a new DataNode will never be added, and that is the cause of the error stack!

HTH


I removed this variable from Custom hdfs-site.

Michael-Bronson


Should I set this variable to true (dfs.client.block.write.replace-datanode-on-failure.policy=true) in Custom hdfs-site?

Michael-Bronson

Master Mentor

@Michael Bronson

As you can see from my initial posting, the ONLY valid options for dfs.client.block.write.replace-datanode-on-failure.policy are:

  • DISABLE
  • NEVER
  • DEFAULT
  • ALWAYS

When using DEFAULT or ALWAYS, if only one DataNode succeeds in the pipeline, the recovery will never succeed and the client will not be able to perform the write. This problem is addressed with this configuration property:

dfs.client.block.write.replace-datanode-on-failure.best-effort

which defaults to false. With the default setting, the client will keep trying until the specified policy is satisfied. When this property is set to true, even if the specified policy can’t be satisfied (for example, there is only one DataNode that succeeds in the pipeline, which is less than the policy requirement), the client will still be allowed to continue to write.

So I suggest you make the following modifications, restart the services with stale configs, and revert:

dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
dfs.client.block.write.replace-datanode-on-failure.best-effort=true
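
Once the configs are saved and the affected services are restarted, one way to sanity-check the effective values from a worker node (assuming the hdfs client there reads the cluster configuration) is:

hdfs getconf -confKey dfs.client.block.write.replace-datanode-on-failure.policy
hdfs getconf -confKey dfs.client.block.write.replace-datanode-on-failure.best-effort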

Please let me know