Some disks failed in my HDFS cluster, and the NodeManager cannot start on these nodes.
How can I fix it?
Did multiple disks fail on the same node?
If yes, I think you can decommission the node from Ambari -> Hosts -> DataNode -> you will find Decommission in the drop-down.
Can you check the failed-volumes tolerance setting in your environment? The default is 0, which is a bit restrictive. Normally 1 or even 2 (on datanodes with high disc density) makes more operational sense.
Then your datanode will start and you can take care of the discs.
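The reply above does not name the property. Assuming it means the HDFS setting dfs.datanode.failed.volumes.tolerated (my guess, based on the default of 0), a rough sketch of the change in hdfs-site.xml, or under HDFS -> Configs in Ambari, could look like this:

```xml
<!-- hdfs-site.xml: assumed property, not named in the original reply -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <!-- number of failed data volumes a DataNode tolerates before it refuses to run -->
  <value>1</value>
</property>
```

As far as I know, the value should stay lower than the number of directories listed in dfs.datanode.data.dir, and the DataNode needs a restart to pick it up. This only covers the DataNode side; the NodeManager has its own disk health checks.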
If you put this machine in a separate config group and remove the references to the directories used, you can keep the machine up. Removing a disk and not replacing it means your data will be written to the OS filesystem. Also do what Benjamin suggests and increase the tolerance.