Created on 02-22-2016 06:02 AM - edited 08-18-2019 05:51 AM
Some disks failed in my HDFS cluster, and the NodeManager cannot start on those nodes.
How can I fix this?
Created 02-22-2016 06:50 AM
In the NameNode UI, check and ensure that there are no missing or corrupt blocks. If that is the case, you can safely remove the failed disk from the DataNode.
Refer to this for details.
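If you prefer the command line, a quick check (just a sketch, assuming the HDFS client is installed on the node and you can run commands as the hdfs user) looks like this:

    # Cluster-wide summary; look for the "Missing blocks" count near the top
    sudo -u hdfs hdfs dfsadmin -report | head -20

    # Full filesystem check; the summary at the end lists corrupt and missing blocks
    sudo -u hdfs hdfs fsck / | tail -20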
Created 02-22-2016 06:26 AM
Are the failed disks all on the same node?
If yes, I think you can decommission the node from Ambari -> Hosts -> DataNode -> you will find Decommission in the drop-down.
Created 02-22-2016 07:11 AM
Thanks for the quick reply.
Created 02-22-2016 06:41 AM
Take a look at this question, maybe it is helpful => https://community.hortonworks.com/questions/3012/what-are-the-steps-an-operator-should-take-to-repl....
Created 02-22-2016 09:21 AM
Can you check the setting for
dfs.datanode.failed.volumes.tolerated
in your environment? The default is 0, which is a bit restrictive. Normally 1 or even 2 (on DataNodes with high disk density) makes more operational sense.
Then your DataNode will start and you can take care of the disks.
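For example, in hdfs-site.xml (or under HDFS -> Configs in Ambari) the setting would look roughly like this; the value of 1 is just an illustration, pick what fits your disk density:

    <property>
      <!-- how many data volumes may fail before the DataNode refuses to start; default is 0 -->
      <name>dfs.datanode.failed.volumes.tolerated</name>
      <value>1</value>
    </property>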
Created 02-22-2016 12:33 PM
If you put this machine in a separate config group and remove the references to the failed directories, you can keep the machine up. Removing a disk and not replacing it will mean your data is written to the OS filesystem, because the old mount point is then just a directory on the root volume. Also do what Benjamin suggests and increase the tolerance.
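As a rough sketch of that config-group override (the mount points below are placeholders, not your actual data directories), you would drop the failed volume from dfs.datanode.data.dir for this one host:

    <!-- before: three data volumes, /grid/2 sits on the failed disk -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data,/grid/2/hadoop/hdfs/data</value>
    </property>

    <!-- after (config group containing only this host): the failed volume is removed -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data</value>
    </property>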