How do we use CM API to gracefully stop and start services running on cluster hosts?

Hi,

We are trying, without much success, to figure out how to use the CM API library to gracefully shut down and then start up services running on specific nodes in a CDH cluster.

We need to take nodes down for maintenance for a few hours while keeping the overall cluster up and running. For example, if the node is a data node that also runs an HBase region server, the script for that node should stop the region server gracefully and then stop the task tracker and data node daemons running on that node.

Browsing through the CM API library, I don't see APIs that let us do this. I do see some stop and start APIs, but I cannot figure out how to call them. We are using Python.

Any insights would be appreciated.

Hi Pankaj,

For your use case, you want the host decommission command. It moves/re-replicates any data from that node to the rest of your cluster while stopping all roles on the host, which lets you perform maintenance while the rest of the cluster stays fully operational.

In the Python bindings, the class ClouderaManager has the method hosts_decommission, which will do exactly what you want.

Make sure to call the corresponding recommission command when you want the host to rejoin the cluster.

See the tutorial for general usage of the Python bindings, including how to interact with commands:

http://cloudera.github.io/cm_api/
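
As a rough illustration of the decommission/recommission flow described above, here is a minimal sketch using the cm_api Python bindings. The host names, credentials, and the do_maintenance callback are placeholders rather than anything from your cluster, and API version 2 is assumed because recommission is not available in v1. Treat it as a starting point to adapt, not a tested procedure.

```python
def connect(cm_host, username, password, api_version=2):
    """Return the ClouderaManager handle from the cm_api bindings.

    The import is done lazily so run_host_maintenance() can be exercised
    without the bindings installed. All arguments are placeholders.
    """
    from cm_api.api_client import ApiResource
    api = ApiResource(cm_host, username=username, password=password,
                      version=api_version)
    return api.get_cloudera_manager()


def run_host_maintenance(cm, host_names, do_maintenance):
    """Decommission the hosts, run the maintenance step, then recommission.

    cm is the ClouderaManager handle; do_maintenance is a hypothetical
    callback that receives the list of host names.
    """
    # hosts_decommission returns a command object; wait() blocks until
    # the command finishes and returns its final state.
    cmd = cm.hosts_decommission(host_names).wait()
    if not cmd.success:
        raise RuntimeError("decommission failed: %s" % cmd.resultMessage)

    do_maintenance(host_names)

    cmd = cm.hosts_recommission(host_names).wait()
    if not cmd.success:
        raise RuntimeError("recommission failed: %s" % cmd.resultMessage)
    return True
```

Usage would look something like run_host_maintenance(connect("cm-server", "admin", "admin"), ["node07.example.com"], my_callback), where every value shown is hypothetical.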

 

Thanks,

Darren

Thanks, Darren. We got the decommission command for the RegionServer role to work, but the recommission command does not: we get a 404 error when we call the URL below.

http://d2phantd05:7180/api/v1/clusters/Cluster%201%20-%20CDH4/services/hbase1/commands/recommission

Note that we are not decommissioning the host, because that would force HDFS data to be redistributed, which is not required.
Any insights you can share on recommissioning a RegionServer through the CM API will be greatly appreciated.

Hi Pankaj,

Recommission is only available in API version v2 and up. Your URL contains "/api/v1", so the endpoint does not exist there, hence the 404.

The API documentation usually says when an endpoint was introduced:

http://cloudera.github.io/cm_api/apidocs/v6/index.html

(The API documentation is also available from the menus in the upper right of your CM server.)

If you have already decommissioned it, it's a good idea to recommission your HDFS DataNode as well, since otherwise you'll have very uneven DataNode utilization.
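
To make the version fix concrete, here is a small sketch in plain Python (standard library only) that builds the same recommission URL against /api/v2 and prepares the POST request. It assumes the endpoint takes a JSON body of the form {"items": [role names]} and HTTP basic auth; the role name and credentials are hypothetical, so check the API documentation on your CM server before relying on it.

```python
import base64
import json
import re
from urllib.parse import quote
from urllib.request import Request, urlopen


def with_api_version(url, version):
    """Swap the /api/vN/ segment of a CM API URL for another version."""
    return re.sub(r"/api/v\d+/", "/api/v%d/" % version, url, count=1)


def recommission_url(base, cluster, service, version=2):
    """Build the service-level recommission URL, escaping path segments."""
    return "%s/api/v%d/clusters/%s/services/%s/commands/recommission" % (
        base.rstrip("/"), version, quote(cluster, safe=""),
        quote(service, safe=""))


def build_recommission_request(url, role_names, username, password):
    """Prepare (but do not send) the POST request.

    Assumption: the body is a JSON role-name list, {"items": [...]}.
    """
    body = json.dumps({"items": list(role_names)}).encode("utf-8")
    token = base64.b64encode(
        ("%s:%s" % (username, password)).encode("utf-8")).decode("ascii")
    return Request(url, data=body, method="POST", headers={
        "Content-Type": "application/json",
        "Authorization": "Basic " + token,
    })


def send(request):
    """Send the request and return the decoded JSON response."""
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The role names to pass are whatever CM reports for the RegionServer and DataNode roles on that host; they are not shown in this thread, so look them up rather than guessing.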

 

Thanks,

Darren

Thanks Darren