I decommissioned the NodeManagers and DataNodes through the REST API, and the requests succeeded. However, I cannot see the decommissioned nodes in the Ambari dashboard; it shows all the DataNodes as live. What is the reason?

Expert Contributor

I decommissioned 4 of the 8 DataNodes and NodeManagers. I checked the dfs.exclude file, and it contains the host names of the decommissioned nodes. I restarted the NameNode. The dashboard still shows 8 DataNodes live and 4 NodeManagers live. Why has the change not taken effect for the DataNodes?

1 ACCEPTED SOLUTION

Expert Contributor

Step 1 : Decommission Nodemanagers from the cluster

Command :

curl -u admin:password -i -H 'X-Requested-By: ambari' -X POST -d '{ "RequestInfo":{ "context":"Decommission NodeManagers", "command":"DECOMMISSION", "parameters":{ "slave_type":"NODEMANAGER", "excluded_hosts":"serf010ext.etops.tllsc.net,serf020ext.etops.tllsc.net,villein010ext.etops.tllsc.net,villein020ext.etops.tllsc.net" }, "operation_level":{ "level":"HOST_COMPONENT", "cluster_name":"Name of the cluster" } }, "Requests/resource_filters":[ { "service_name":"YARN", "component_name":"RESOURCEMANAGER" } ]}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/requests

Step 2 : Decommission DataNodes from cluster

Command :

curl -u admin:password -i -H 'X-Requested-By: ambari' -X POST -d '{ "RequestInfo":{ "context":"Decommission DataNodes", "command":"DECOMMISSION", "parameters":{ "slave_type":"DATANODE", "excluded_hosts":"serf010ext.etops.tllsc.net,serf020ext.etops.tllsc.net,villein010ext.etops.tllsc.net,villein020ext.etops.tllsc.net" }, "operation_level":{ "level":"HOST_COMPONENT", "cluster_name":"Name of the cluster" } }, "Requests/resource_filters":[ { "service_name":"HDFS", "component_name":"NAMENODE" } ]}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/requests
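
The two decommission requests above differ only in the slave type, the service, and the master component, so the JSON body can be generated rather than typed by hand. The following is a minimal sketch, not part of the original procedure: the function name build_decommission_payload and the argument order are made up for illustration. It joins the hosts with commas and no whitespace, which avoids stray spaces or line breaks inside "excluded_hosts".

```shell
# Hypothetical helper: prints the JSON body for an Ambari DECOMMISSION request.
# Arguments: slave type, service name, master component, cluster name, hosts...
build_decommission_payload() {
  local slave_type=$1 service=$2 master=$3 cluster=$4
  shift 4
  # Join the remaining arguments (the hosts) with commas and no whitespace.
  local hosts
  hosts=$(IFS=,; printf '%s' "$*")
  printf '{"RequestInfo":{"context":"Decommission %s","command":"DECOMMISSION","parameters":{"slave_type":"%s","excluded_hosts":"%s"},"operation_level":{"level":"HOST_COMPONENT","cluster_name":"%s"}},"Requests/resource_filters":[{"service_name":"%s","component_name":"%s"}]}\n' \
    "$slave_type" "$slave_type" "$hosts" "$cluster" "$service" "$master"
}

# Example use (placeholders as in the commands above):
#   curl -u admin:password -i -H 'X-Requested-By: ambari' -X POST \
#     -d "$(build_decommission_payload DATANODE HDFS NAMENODE cluster_name host1 host2)" \
#     http://ambari_hostname:8080/api/v1/clusters/cluster_name/requests
```

For the NodeManagers, the same call would take NODEMANAGER, YARN, and RESOURCEMANAGER as its first three arguments.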

Step 3 : Stop the DataNode service on each of the decommissioned nodes

Command:

curl -u admin:password -i -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/hosts/serf010ext.etops.tllsc.net/host_compo...

Command:

curl -u admin:password -i -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/hosts/serf020ext.etops.tllsc.net/host_compo...

Command:

curl -u admin:password -i -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/hosts/villein010ext.etops.tllsc.net/host_co...

Command:

curl -u admin:password -i -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari_hostname:8080/api/v1/clusters/cluster_name/hosts/villein020ext.etops.tllsc.net/host_co...

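The four stop requests in Step 3 are identical except for the host name, so they could be issued from a loop. This is only a sketch under one loud assumption: the URLs in the post are cut off after "host_co...", and the sketch assumes they end in host_components/DATANODE. The names COMPONENT_PATH, component_url, and stop_datanodes are hypothetical.

```shell
# ASSUMPTION: the original URLs are truncated; this default guesses the
# missing tail. Set COMPONENT_PATH yourself if your URL ends differently.
COMPONENT_PATH=${COMPONENT_PATH:-host_components/DATANODE}

# Prints the host-component URL.
# $1 = Ambari base URL, $2 = cluster name, $3 = host name
component_url() {
  printf '%s/api/v1/clusters/%s/hosts/%s/%s\n' "$1" "$2" "$3" "$COMPONENT_PATH"
}

# Sends the same PUT as in Step 3 to every host given.
# $1 = Ambari base URL, $2 = cluster name, remaining arguments = hosts
stop_datanodes() {
  local base=$1 cluster=$2 host
  shift 2
  for host in "$@"; do
    curl -u admin:password -i -H 'X-Requested-By: ambari' -X PUT \
      -d '{"HostRoles": {"state": "INSTALLED"}}' \
      "$(component_url "$base" "$cluster" "$host")"
  done
}

# Example use:
#   stop_datanodes http://ambari_hostname:8080 cluster_name \
#     serf010ext.etops.tllsc.net serf020ext.etops.tllsc.net
```
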
Step 4 : Check the under-replicated and corrupted block counts on the Ambari dashboard. They will show some nonzero number.

Step 5 : Restart the standby NameNode

Step 6 : Restart the active NameNode

Step 7 : Check the under-replicated and corrupted block counts on the Ambari dashboard again; they should be zero. After the NameNodes restart, the blocks are distributed across the live DataNodes only.
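
The block checks above use the Ambari dashboard. As an alternative that is not part of the original procedure, the same counters can probably be read from the output of hdfs dfsadmin -report. The sketch below assumes the report contains lines of the form "Under replicated blocks: N" and "Blocks with corrupt replicas: N", as Hadoop 2.x prints them; the function name blocks_are_healthy is hypothetical.

```shell
# Reads a dfsadmin report on stdin and succeeds only if both counters were
# found and both are zero. The line format is an assumption (see above).
blocks_are_healthy() {
  awk -F': *' '
    /Under replicated blocks:/      { under += $2; seen_under = 1 }
    /Blocks with corrupt replicas:/ { corrupt += $2; seen_corrupt = 1 }
    END { exit !(seen_under && seen_corrupt && under == 0 && corrupt == 0) }
  '
}

# Example use, run as a user allowed to call dfsadmin:
#   hdfs dfsadmin -report | blocks_are_healthy && echo "no under-replicated or corrupt blocks"
```

The function fails when either line is missing, so an unexpected report format is not silently treated as healthy.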

Here serf010ext, serf020ext, villein010ext, and villein020ext are the nodes being decommissioned from the cluster.

Thank you.

9 REPLIES


I believe decommissioned DataNodes are not stopped automatically; they need to be stopped explicitly through API calls. Decommissioned NodeManagers, on the other hand, go down automatically.

Expert Contributor

How can we stop them, then? If you have an example or a link, can you please post it here?

Master Mentor

Expert Contributor

@Artem Ervits Once the nodes are excluded and decommissioned, they will not be in the STARTED state. How can we stop them using the link you gave? Can you please tell me the procedure to follow when decommissioning?

Master Mentor

@Ram D I believe you stop the service first and then decommission. That's the way it's done in Ambari. Please refer to this https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Ambari_Users_Guide/content/_deleting_a_h...

Expert Contributor

I tried stopping the DataNode and NodeManager services first and then decommissioning the nodes. That did not work: the decommission did not go through and failed with an internal exception. I then decommissioned the NodeManagers and DataNodes, in that order, using curl commands, and changed the DataNode service to the INSTALLED state. I restarted the NameNodes so they would pick up the live status of the DataNodes, and it updated successfully. Before the NameNode restart, I could see corrupted and under-replicated blocks. After the restart, both counts went to zero. The Ambari dashboard now shows four nodes live.

Master Mentor

@Ram D It would be nice if you documented the whole procedure and posted it as a solution.


Explorer

Hi all,

I am wondering if there is a reliable way to tell the completion of a NodeManager decommission?

For the DataNode decommission, I can do so by checking NameNode's log for completion. But it seems that there is no clear message from ResourceManager's log.

Cheers,