Deletion of dead nodes in Ambari

Contributor

Can we have a shell script that deletes the nodes in Ambari that have already been deleted in Azure?

1 ACCEPTED SOLUTION

Master Mentor

@nanibigdata1 
Are you talking about the Azure scale-down approach, in which Azure deletes unwanted hosts from the cluster when they are no longer needed? Ideally, Azure should also take care of deleting those hosts from Ambari.

But if that is not happening, can you please help us understand what is happening?

1. How was the node deleted from Azure? Through the scale-down approach, or was the node deleted manually?

2. Before a node is deleted from an Ambari cluster, the components running on it (DataNode, NodeManager, etc.) must first be decommissioned. If the node hosts master components, they need to be moved to another node first. The components are then stopped, and only after that can the components and the host be deleted safely. Is that not happening at your end?

3. Even after the steps mentioned in point 2, do you still see the host listed in the "hosts" table of the Ambari database, or returned by the following Ambari API call?

 

# curl -u admin:admin -H "X-Requested-By: ambari" -X GET http://$AMBARI_FQDN:8080/api/v1/clusters/$CLUSTER_NAME/hosts

(Or) from the Ambari database, by running the following query:
# SELECT * FROM hosts;
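
To compare what Ambari still lists against what actually exists in Azure, the host names can be pulled out of the API response. The following is only a minimal sketch, assuming the response contains `"host_name" : "..."` entries as the hosts endpoint normally returns, and that `jq` may not be installed. The function names `extract_host_names` and `list_ambari_hosts` are hypothetical, as are the `AMBARI_USER` and `AMBARI_PASS` variables; the admin:admin default and port 8080 are copied from the command above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read an Ambari "hosts" API response on stdin and
# print each distinct host name on its own line.
extract_host_names() {
  grep -o '"host_name"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u
}

# Hypothetical wrapper: query the hosts endpoint and print the host names.
# AMBARI_FQDN and CLUSTER_NAME are assumed to be set in the environment.
list_ambari_hosts() {
  curl -s -u "${AMBARI_USER:-admin}:${AMBARI_PASS:-admin}" \
    -H "X-Requested-By: ambari" \
    "http://${AMBARI_FQDN}:8080/api/v1/clusters/${CLUSTER_NAME}/hosts" \
    | extract_host_names
}
```

Running `list_ambari_hosts` would then print one host per line, which can be compared against the list of nodes that still exist in Azure.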

 

Something similar is discussed in the thread below, but it is not for an Azure environment, so please do not follow it as is:
https://community.cloudera.com/t5/Support-Questions/How-can-I-delete-the-host-from-ambari-server/m-p...

*NOTE:* If Azure has scaled down the Ambari cluster (that is, removed some nodes from it) but one of those hosts is by any chance still running the ambari-agent, then the agent might keep sending registration requests and heartbeats to the Ambari server. So please check whether you still see the following kind of messages in your ambari-server.log even after the node has been moved out of the Ambari cluster.

This is how the logging usually appears in ambari-server.log when a node is deleted from the cluster:

 

Decommissioning DATANODE on example.host1.com
Decommissioning NODEMANAGER example.host1.com
Received Delete request for host example.host1.com from cluster ExampleCluster.
Removing hosts [[example.host1.com]] from available hosts on hosts removed event.

 

But if, after the above messages, you still see the following kind of message in ambari-server.log ("Agent is still heartbeating"), it indicates that the Ambari agent is still running on the node that was removed from the cluster. The agent will therefore keep sending registration/heartbeat requests to the Ambari server, so you might still see an entry for that host in the "hosts" table of the Ambari DB. In this case, ideally Azure or whatever tooling you use should have stopped the ambari-agent properly on the node immediately after the host deletion.

 

HeartBeatHandler:185 - Host: example.host1.com not found. Agent is still heartbeating.
Received host registration, host=[hostname=example.host1.com,.............,agentVersion=2.x.y.x

TopologyManager.onHostRegistered: Entering
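
The check for such messages can be scripted. This is a rough sketch that assumes the log line has exactly the wording shown in the sample above; the function name `find_heartbeating_removed_hosts` is hypothetical, and the log path in the usage note is the usual default location, which may differ on your installation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read ambari-server.log lines on stdin and print each
# host that the server reports as "not found" while its agent still heartbeats.
find_heartbeating_removed_hosts() {
  sed -n 's/.*Host: \([^ ]*\) not found\. Agent is still heartbeating\..*/\1/p' \
    | sort -u
}
```

For example, `find_heartbeating_removed_hosts < /var/log/ambari-server/ambari-server.log` would list the affected hosts. If a listed node is still reachable, running `ambari-agent stop` on it should stop the heartbeats.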

 

So if Ambari keeps showing the old host even after the Azure node deletion, it might be because the agent kept running for some time after the deletion.
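
Coming back to the original question, a cleanup script could issue a DELETE request against the same hosts endpoint for every host that no longer exists in Azure. The sketch below is not a tested procedure for any particular Ambari version. It assumes the host's components are already decommissioned and stopped as described in point 2, and it defaults to a dry run that only prints the request. All names (`ambari_host_url`, `delete_ambari_host`, `delete_ambari_hosts`, `DRY_RUN`, `AMBARI_USER`, `AMBARI_PASS`) are hypothetical.

```shell
#!/usr/bin/env bash

# Build the REST URL for one host of the cluster.
# AMBARI_FQDN and CLUSTER_NAME are assumed to be set in the environment.
ambari_host_url() {
  printf 'http://%s:8080/api/v1/clusters/%s/hosts/%s' \
    "$AMBARI_FQDN" "$CLUSTER_NAME" "$1"
}

# Delete one host from the cluster. With DRY_RUN=1 (the default) the request
# is only printed, so the list of hosts can be reviewed first.
delete_ambari_host() {
  local url
  url="$(ambari_host_url "$1")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "DRY RUN: curl -X DELETE $url"
  else
    curl -s -u "${AMBARI_USER:-admin}:${AMBARI_PASS:-admin}" \
      -H "X-Requested-By: ambari" -X DELETE "$url"
  fi
}

# Read host names (one per line) on stdin and delete each of them,
# skipping empty lines.
delete_ambari_hosts() {
  local host
  while IFS= read -r host; do
    if [ -n "$host" ]; then
      delete_ambari_host "$host"
    fi
  done
}
```

A possible usage is `DRY_RUN=0 delete_ambari_hosts < dead_hosts.txt`, where `dead_hosts.txt` is a hypothetical file listing the hosts already removed from Azure. Ambari may reject the delete while components on the host are still in a started state, which is why the decommission and stop steps come first.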


2 REPLIES

Contributor

Can we use the Ambari_helper concept to write a script that deletes dead nodes from Ambari?

Could you please guide me on where I can get details about the Ambari_helper classes?