
Automate CM Express Wizard installation via API?


New Contributor

Hello,

 

I work at Hadapt, a Cloudera partner company, on a product that builds on top of Cloudera CDH. For testing, we would like to automate the creation of clusters with different kinds of full Cloudera deployments, as created and configured by Cloudera Manager. For example, we would like to deploy CDH 4.3 and 4.4 using RPMs and Parcels into different hardware environments in a fully automated way.

 

We believe most users will be using the Express Wizard to install clusters. Is there any way to simply automate the equivalent of an Express Wizard install via CM without needing to interact with the GUI?

 

For example, I am able to install Cloudera Manager the way a customer would using:

 

cloudera-manager-installer.bin --i-agree-to-all-licenses --noprompt --noreadme --nooptions

 
I have been attempting to do so using the v5 API, but find that I need to do a lot of things outside of the API, such as setting up the agents on the cluster nodes via packages, trying to set possibly unsettable things like zookeeper-autocreate-dirs, or manually formatting HDFS. (I see that the v6 API includes support for doing a hostInstall. While necessary, I think this would still not be sufficient, as it does not cover the wizard side of things. Also, we would like to replicate the experience a customer would get when using CM 4.x, not the CM 5 beta.)

 

While I imagine I can eventually figure out how to make the cm-api calls work, I worry it will not completely replicate the customer experience.
 
Have I missed some piece of documentation that explains how to do this? Or are there perhaps some undocumented calls that we could take advantage of? There must be something that Cloudera uses internally to deploy clusters for testing...
 
Thanks for any help!
 
 
 

 


Re: Automate CM Express Wizard installation via API?

Hi emilsit,

 

You're on the right track.

 

The general flow of installation should go:

1) Install cm binaries on server host (may also need to install dependencies like java or database)

2) Install cm agents on cluster hosts (as you pointed out, this is only possible via CM API in CM5)

3) Replicate every configuration and command performed by the CM wizard through the API. Every config and command involved is exposed in the API (our internal testing automation uses this). You can even go further and add things like enabling NN HA or JT HA.

 

Note that it is not possible to get CM recommendations via the API. You will need to determine all configuration manually. It may help to run the wizard through the UI, then capture what CM configures and repeat it in your API scripts. The "deployment" endpoint in the API can be very useful for this.
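To illustrate the capture step, here is a minimal Python sketch that builds a GET request for the deployment endpoint using only the standard library. The host name, credentials, port 7180, and the "v5" API version are assumptions for illustration, and the helper names are made up for this example; they are not part of the cm_api bindings.

```python
import base64
import json
import urllib.request


def deployment_request(host, user, password, port=7180, api_version="v5"):
    """Build (but do not send) a GET request for the CM deployment dump.

    The host, port and API version are placeholders; adjust for your CM server.
    """
    url = "http://%s:%d/api/%s/cm/deployment" % (host, port, api_version)
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {"Authorization": "Basic " + token}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_deployment(request):
    """Send the request and parse the JSON body (needs a reachable CM server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Running the wizard once in the UI and then calling something like fetch_deployment(deployment_request("cm.example.com", "admin", "admin")) should give a JSON document you can diff against what your scripts configure.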

 

Also, don't forget to distribute and install the JDBC drivers on all cluster hosts if you are using MySQL or PostgreSQL.

 

Thanks,

Darren

Re: Automate CM Express Wizard installation via API?

New Contributor

Darren,

 

Thanks for the quick reply.

 

Manually re-creating the behavior seems somewhat error-prone, because the "automated" steps that the Express Wizard performs may change without us noticing. (We have even considered using Selenium or something similar to drive the wizard remotely, though that seems fraught with its own headaches.)

 

I had not noticed the deployment endpoint, thanks for the pointer. It looks like the deployment endpoint is not available in the Python API. From skimming the Java code, it looks functionally similar to this dump script that I threw together last week: https://gist.github.com/sit/7208850

 

Do you have a script that automatically restores, via the API, all the settings found in a deployment?

 

Thanks.

 

Re: Automate CM Express Wizard installation via API?

You can PUT to the cm deployment API and it will create a cluster as specified in the JSON. See http://cloudera.github.io/cm_api/apidocs/v6/path__cm_deployment.html (available since v1 or v2 of the API, I forget).

 

It looks like deployment is missing from the Python bindings, as you pointed out. You'll have to access the URL manually, or enhance the Python bindings = ). It's in the Java bindings.

 

Using deployment won't run the commands for you, but is one way of quickly creating a cluster with the desired role assignments and configuration.
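As a rough sketch of accessing the URL manually, the following Python builds a PUT request that uploads a previously captured deployment document. The host, credentials, port 7180, and "v5" version are placeholders, and put_deployment_request is a hypothetical helper name, not something provided by the cm_api bindings. Check the linked API docs for any query parameters your version supports before using this against a real server.

```python
import base64
import json
import urllib.request


def put_deployment_request(host, user, password, deployment,
                           port=7180, api_version="v5"):
    """Build (but do not send) a PUT of a deployment dict to /cm/deployment."""
    url = "http://%s:%d/api/%s/cm/deployment" % (host, port, api_version)
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Authorization": "Basic " + token,
        "Content-Type": "application/json",
    }
    body = json.dumps(deployment).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def send(request):
    """Send the request and return the parsed JSON reply (needs a live CM)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

This only recreates role assignments and configuration; the first-run commands still have to be issued separately.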

 

While it's true that steps change a little over time, CM generally adds steps and configs rather than removing them, and your old workflows will work as well as they used to because we try to maintain API compatibility. This means that you can usually expect your scripts to keep working across CM versions, and you only need to update them if you want to take advantage of a new feature. One notable exception will be Impala, which will get a new mandatory role soon, even for CDH4. Keep an eye out for that when CM 4.8 comes out. We'll also make some minor incompatible changes in CM 5, like deleting deprecated, refactored, or unused configs. Other partners are using the CM API effectively to automate deployments.

 

Thanks,

Darren

Re: Automate CM Express Wizard installation via API?

New Contributor
I see.

So the deployment will not take care of steps that the wizard sequences like initializing zookeeper or formatting hdfs? Does it handle download and distribution of parcels?

It seems like a very simple/common use case for the API here; have any of the other partners released sample code that you know of?

Re: Automate CM Express Wizard installation via API?

Initializing zookeeper and formatting hdfs are available via the API as commands, just like start, create HDFS /tmp dir, create hive metastore tables, etc.
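To make that concrete, here is a small Python sketch that lays out a first-run command sequence as REST calls. The endpoint names (zooKeeperInit, hdfsFormat, hdfsCreateTmpDir, start) are written as I recall them from the API docs, so please verify them against the docs for your API version. The cluster, service, and role names are placeholders, and the ordering is one plausible sequence, not necessarily the exact order the wizard uses.

```python
from urllib.parse import quote


def first_run_commands(cluster, zk_service, zk_roles, hdfs_service, namenode_roles):
    """Return (method, path, body) tuples for a basic first run.

    Paths are relative to the API root, e.g. http://host:7180/api/v5.
    """
    def service(name):
        return "/clusters/%s/services/%s" % (quote(cluster, safe=""),
                                             quote(name, safe=""))

    zk = service(zk_service)
    hdfs = service(hdfs_service)
    return [
        # Role-level commands take the list of role names to act on.
        ("POST", zk + "/roleCommands/zooKeeperInit", {"items": list(zk_roles)}),
        ("POST", zk + "/commands/start", None),
        ("POST", hdfs + "/roleCommands/hdfsFormat", {"items": list(namenode_roles)}),
        ("POST", hdfs + "/commands/start", None),
        # Service-level command; HDFS must be running for this one.
        ("POST", hdfs + "/commands/hdfsCreateTmpDir", None),
    ]
```

Each call returns a command object, so a real script would wait for one command to finish before issuing the next.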

 

Deployment only handles configuration, not binary distribution, so it won't handle parcels for you. There are also API commands for that (I recently added parcel examples to the CM API docs at http://cloudera.github.io/cm_api/docs/python-client/).
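A parcel rollout can be driven as a small state machine: read the parcel's current stage, then issue the next command. The Python sketch below shows only that decision logic. The stage and command names are as I recall them from the API docs and should be checked against your version, the cluster name and version string in the usage note are made-up examples, and next_parcel_step is a hypothetical helper.

```python
from urllib.parse import quote

# Stage reported by the parcel resource -> command that moves it forward.
# In-progress stages (DOWNLOADING, DISTRIBUTING, ...) are absent on purpose:
# for those the caller should just poll again.
NEXT_COMMAND = {
    "AVAILABLE_REMOTELY": "startDownload",
    "DOWNLOADED": "startDistribution",
    "DISTRIBUTED": "activate",
}


def parcel_path(cluster, product, version):
    """Path of one parcel, relative to the API root."""
    return "/clusters/%s/parcels/products/%s/versions/%s" % (
        quote(cluster, safe=""), quote(product, safe=""), quote(version, safe=""))


def next_parcel_step(cluster, product, version, stage):
    """Return the path to POST to next, or None if there is nothing to issue."""
    command = NEXT_COMMAND.get(stage)
    if command is None:
        return None
    return parcel_path(cluster, product, version) + "/commands/" + command
```

For example, next_parcel_step("c1", "CDH", "4.4.0", "DOWNLOADED") yields the startDistribution path, while an ACTIVATED parcel yields None.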

 

Unfortunately, I'm not aware of any sample code out there.