Deleting a dead host from Ambari

Contributor

Hi all,

I have a 9-host cluster on HDP 2.6.5. Recently, a host went down and had to be rebuilt. It was hosting a ZooKeeper server, a DataNode, and a YARN server. Can I safely delete the host from the Ambari web UI, given that I'm unable to stop its services?

Thanks.

1 ACCEPTED SOLUTION

Master Mentor

@mikelok 

Before you embark on that, a couple of questions: have you moved or recreated the ZooKeeper, YARN, and DataNode components on another node?

You should have at least 3 ZooKeeper servers, and what about your YARN server? Usually, when a host crashes and stops sending heartbeats, it is excluded from the healthy nodes after a period of time.
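For instance, a quick way to confirm the surviving ZooKeeper quorum (a minimal sketch; the hostnames below are placeholders for your actual ZooKeeper nodes) is to query each server with ZooKeeper's built-in "stat" four-letter-word command:

    # Placeholders -- substitute your remaining ZooKeeper hosts
    for zk in zk1.example.com zk2.example.com zk3.example.com; do
      echo "== $zk =="
      # "Mode:" reports whether the server is the leader or a follower
      echo stat | nc $zk 2181 | grep Mode
    done

One server should report Mode: leader and the others Mode: follower; if a majority of your ZooKeeper servers do not respond, restore the quorum before deleting the host.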

Procedure

1. Decommission DataNodes:

  1. From the node hosting the NameNode, edit the $HADOOP_CONF_DIR/dfs.exclude file by adding the list of DataNode hostnames, one per line (see the consolidated sketch after this procedure).
  2. Update the NameNode with the new set of excluded DataNodes. Run the following command from the NameNode machine:
    # su $HDFS_USER
    $ hdfs dfsadmin -refreshNodes
    $HDFS_USER is the user that owns the HDFS services, which is usually hdfs.
  3. Open the NameNode web interface, and go to the DataNodes page:
    http://<abc.my_namenode.com>:50070. Verify that the state is changed to Decommission in Progress for the DataNodes that are being decommissioned.
  4. Shut down the decommissioned nodes when all of the DataNodes are decommissioned. All of the blocks will already be replicated.
  5. If you use a dfs.include file on your cluster, remove the decommissioned nodes from that file on the NameNode host machine. Then refresh the nodes on that machine:
    # su $HDFS_USER
    $ hdfs dfsadmin -refreshNodes
    If no dfs.include is used, all DataNodes are considered included in the cluster, unless a node exists in a $HADOOP_CONF_DIR/dfs.exclude file.
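Putting steps 1 through 3 together, a minimal sketch of the commands on the NameNode host might look like this (dead-node.example.com is a placeholder for the failed host; on HDP, $HADOOP_CONF_DIR is typically /etc/hadoop/conf):

    # As root on the NameNode host: add the dead host to the exclude file
    echo "dead-node.example.com" >> $HADOOP_CONF_DIR/dfs.exclude

    # As the HDFS user, tell the NameNode to re-read the exclude list
    su - hdfs -c "hdfs dfsadmin -refreshNodes"

    # Verify decommission status from the CLI (an alternative to the web UI)
    su - hdfs -c "hdfs dfsadmin -report" | grep -E "Hostname|Decommission Status"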

You can also use the Ambari REST API to remove the dead host and its components; here is a reference:

 

https://cwiki.apache.org/confluence/display/AMBARI/Using+APIs+to+delete+a+service+or+all+host+compon...
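For example, a sketch of that API flow with curl (the admin:admin credentials, port 8080, cluster name MYCLUSTER, and host dead-node.example.com are all illustrative placeholders; Ambari requires the X-Requested-By header on modifying requests):

    # Mark every component on the dead host as stopped (state INSTALLED)
    curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
      -d '{"RequestInfo":{"context":"Stop All Components"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' \
      http://ambari.example.com:8080/api/v1/clusters/MYCLUSTER/hosts/dead-node.example.com/host_components

    # Then delete the host and all of its components from Ambari
    curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE \
      http://ambari.example.com:8080/api/v1/clusters/MYCLUSTER/hosts/dead-node.example.com

On some Ambari versions the host DELETE is rejected until each component is deleted individually (DELETE against .../hosts/<host>/host_components/<COMPONENT_NAME>); the linked wiki page walks through that variant.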

 

Hope that helps

 

 


2 REPLIES

Contributor

Hi Shelton,

The steps you provided worked perfectly.

Thanks!