Explorer
Posts: 27
Registered: ‎05-15-2015
Accepted Solution

Oozie custom action deployment on Cloudera 5.4.x clusters running Cloudera Manager

Our environment runs multiple Cloudera 5.4.x clusters, each running Cloudera Manager.

We would like to deploy an Oozie custom action and automate the process as much as possible. For development we have used the Cloudera QuickStart VM and follow roughly these steps:

1) Stop Oozie and its dependencies: use Cloudera Manager to stop Hue first, then Oozie (since Hue depends on Oozie).

2) Configure Oozie and deploy the jars that launch the custom action.

2a) For the initial install only, register the custom action with Oozie using Cloudera Manager (in particular, identify the XML schema and the corresponding class).

2b) On the initial install and for every upgrade, deploy the jars required to launch the custom action to /var/lib/oozie.

3) Restart Oozie and its dependencies: use Cloudera Manager to restart Oozie, then Hue.

 

I would like to have command-line automation for steps 1, 2a and 3 if possible, especially steps 1 and 3.

 

For steps 1 and 3 (stopping and restarting Oozie), is there a command-line interface suitable for inclusion in an RPM's shell scripts (invoked via rpm)? In particular, it seems that using the command-line "service" interface is contraindicated as per http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_services_stop.ht...

 

For step 2a, I'm not sure whether this can be done via a script. The changes are not reflected in the oozie-site.xml file (/etc/oozie/conf.dist/oozie-site.xml), which does not seem to be updated or accessed by Oozie.

 

Posts: 1,836
Kudos: 415
Solutions: 295
Registered: ‎07-31-2013

Re: Oozie custom action deployment on Cloudera 5.4.x clusters running Cloudera Manager

For all of these steps, you can make use of CM's API and other extension features.

The CM API is a REST API (usable via curl commands, Python or Java programs, etc.) that lets you set configuration, restart clusters, services or role instances, and a lot more. This is documented with some examples at the CM API website: http://cloudera.github.io/cm_api/
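As a rough illustration of steps 1 and 3 from the question, here is a minimal sketch using the cm_api Python client. The CM host, the credentials and the service names "hue" and "oozie" are assumptions for illustration only; substitute whatever your cluster actually uses. The helper names (run_and_wait, stop_for_deploy, start_after_deploy) are hypothetical and not part of cm_api.

```python
# Sketch only: ordered stop/start of Hue and Oozie through the CM API.
# The helpers accept any objects exposing stop()/start() that return a
# command with wait(), which is how cm_api's ApiService behaves.


def run_and_wait(service, action, timeout=600):
    """Call service.<action>() and block until CM reports the command result."""
    command = getattr(service, action)()
    result = command.wait(timeout)
    # success is None if the command is still running after the timeout.
    if not result.success:
        raise RuntimeError(
            "%s of %s failed: %s" % (action, service.name, result.resultMessage)
        )
    return result


def stop_for_deploy(hue, oozie):
    # Hue depends on Oozie, so Hue is stopped first.
    run_and_wait(hue, "stop")
    run_and_wait(oozie, "stop")


def start_after_deploy(hue, oozie):
    # Reverse order on the way back up: Oozie, then Hue.
    run_and_wait(oozie, "start")
    run_and_wait(hue, "start")


def main():
    # Imported here so the helpers above can be loaded without cm_api installed.
    from cm_api.api_client import ApiResource

    # Hypothetical host and credentials.
    api = ApiResource("cm-host.example.com", username="admin", password="admin")
    cluster = api.get_all_clusters()[0]
    hue = cluster.get_service("hue")      # assumed service name
    oozie = cluster.get_service("oozie")  # assumed service name

    stop_for_deploy(hue, oozie)
    # ... deploy the custom action jars to /var/lib/oozie here ...
    start_after_deploy(hue, oozie)
```

Calling main() from an RPM scriptlet (for example via a small wrapper script) would be one way to avoid the "service" command; whether that fits your packaging is a judgment call.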

For deploying custom jars easily, you can also consider writing a custom parcel (if you already use parcels and not RPM/DEB packages to run CDH in CM). Documentation on what parcels are and how to write a parcel is at https://github.com/cloudera/cm_ext/wiki/Parcels:-What-and-Why%3F, https://github.com/cloudera/cm_ext/wiki/The-parcel-format and https://github.com/cloudera/cm_ext/wiki/Building-a-parcel

One example Oozie-related parcel I have in my personal repo is one that installs the extjs UI libraries in an automatic manner. You can reference it for building your own custom jar deployment parcel: https://github.com/QwertyManiac/extjs-parcel

Does this help?
Explorer
Posts: 27
Registered: ‎05-15-2015

Re: Oozie custom action deployment on Cloudera 5.4.x clusters running Cloudera Manager

Thanks for the help. Regarding the API and the documentation, I've had a chance to look over it, and stopping and starting/restarting the Oozie server and its dependencies is fairly straightforward. However, I'm not quite clear on how to update the properties that register the custom action's classes, namely the Oozie ActionService Executor Extension Classes (oozie.service.ActionService.executor.ext.classes) and the Oozie SchemaService Workflow Extension Schemas (oozie.service.SchemaService.wf.ext.schemas). I've tried several different approaches via the Python API as seen at http://cloudera.github.io/cm_api/docs/python-client/, but so far I have had no luck. If anyone knows how to do this, I'd appreciate it.

Posts: 1,836
Kudos: 415
Solutions: 295
Registered: ‎07-31-2013

Re: Oozie custom action deployment on Cloudera 5.4.x clusters running Cloudera Manager

I've posted a reply on your other thread opened for this topic: http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Cloudera-5-4-x-Oozie-Custom-Action-py...

Let's carry on there, and mark this thread resolved (for the benefit of others looking for the same thing)?
Cloudera Employee
Posts: 14
Registered: ‎05-27-2014

Re: Oozie custom action deployment on Cloudera 5.4.x clusters running Cloudera Manager

The following steps can be done to get/set configurations:

==== Oozie ActionService Executor Extension Classes ====
>>> from cm_api.api_client import ApiResource
>>> print ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].get_config(view='full')['oozie_executor_extension_classes']
: oozie_executor_extension_classes = none
>>> print ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].update_config({'oozie_executor_extension_classes':'oozie_test.class'})
>>> print ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].get_config(view='full')['oozie_executor_extension_classes']
: oozie_executor_extension_classes = oozie_test.class
====================

==== Oozie SchemaService Workflow Extension Schemas ====
>>> from cm_api.api_client import ApiResource
>>> print ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].get_config(view='full')['oozie_workflow_extension_schemas']
: oozie_workflow_extension_schemas = ssh-action-0.1.xsd,hive-action-0.3.xsd,sqoop-action-0.3.xsd,shell-action-0.2.xsd,shell-action-0.1.xsd
>>> ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].update_config({'oozie_workflow_extension_schemas':'ssh-action-0.1.xsd,hive-action-0.3.xsd,sqoop-action-0.3.xsd,shell-action-0.2.xsd,shell-action-0.1.xsd,oozie-test-action.xsd'})
>>> print ApiResource('nightly54-1.vpc.cloudera.com').get_all_clusters()[0].get_all_services()[4].get_all_roles()[0].get_config(view='full')['oozie_workflow_extension_schemas']
: oozie_workflow_extension_schemas = ssh-action-0.1.xsd,hive-action-0.3.xsd,sqoop-action-0.3.xsd,shell-action-0.2.xsd,shell-action-0.1.xsd,oozie-test-action.xsd
===================

Hardcoded value used for method such as "get_all_clusters()[0]" for brevity. A for-loop would be needed to parse for specific value and return the object for the next call, etc... [1]. For future reference, all the modules can be found at ".../cm_api/endpoints."

[1] http://cloudera.github.io/cm_api/docs/python-client
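The for-loop mentioned above might look roughly like the following sketch, which replaces the hardcoded indices with lookups by type and appends to the existing values instead of overwriting them. The type strings "OOZIE" and "OOZIE_SERVER" are assumptions about how CM labels the service and role, and the helper names are hypothetical. It also assumes the objects returned by get_config(view='full') expose value and default attributes.

```python
# Sketch only: locate the Oozie server role by type and register a custom action.


def find_service(cluster, service_type="OOZIE"):
    """Return the first service of the given type (type string is an assumption)."""
    for service in cluster.get_all_services():
        if service.type == service_type:
            return service
    raise LookupError("no service of type %s" % service_type)


def find_roles(service, role_type="OOZIE_SERVER"):
    """Return all roles of the given type (type string is an assumption)."""
    return [role for role in service.get_all_roles() if role.type == role_type]


def append_csv(current, addition):
    """Append addition to a comma-separated string unless it is already present."""
    items = [item.strip() for item in (current or "").split(",") if item.strip()]
    if addition not in items:
        items.append(addition)
    return ",".join(items)


def register_action(role, executor_class, schema_file):
    """Add the executor class and schema file to the role's current settings."""
    full = role.get_config(view="full")

    def effective(key):
        # Fall back to the default when no explicit value has been set.
        config = full[key]
        return config.value if config.value is not None else config.default

    role.update_config({
        "oozie_executor_extension_classes": append_csv(
            effective("oozie_executor_extension_classes"), executor_class),
        "oozie_workflow_extension_schemas": append_csv(
            effective("oozie_workflow_extension_schemas"), schema_file),
    })
```

With a cluster object obtained as in the session above, something like register_action(find_roles(find_service(cluster))[0], 'oozie_test.class', 'oozie-test-action.xsd') would apply both settings; Oozie still needs a restart afterwards for them to take effect.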