Cloudera 5.4.x Oozie Custom Action python to configure classes and xml schema

Explorer

I am creating a script to automate the deployment of an Oozie custom action through Cloudera Manager, using the Cloudera-supplied Python API documented at http://cloudera.github.io/cm_api/docs/python-client/. I am considering the Python requests library as a fallback option if needed.

 

However, after reviewing the supplied documentation, it is still unclear to me how to use the interface to update the following:

 

  1. Oozie ActionService Executor Extension Classes, and
  2. Oozie SchemaService Workflow Extension Schemas

I am testing on the QuickStart VM, where the parameters can be seen at http://quickstart.cloudera:7180/cmf/services/2/config. Selecting “Advanced” in the Category filters on the left of the page makes the parameters easier to find.

 

What is the Python interface for both reading and writing these parameters?

 

With best regards:

 

Bill M.

 

 

1 ACCEPTED SOLUTION

Explorer

I'm not sure when I need to access the configuration from a role (or a group) as opposed to using the service directly, but based on a separate communication I used the role. The snippet of code below is based on a working version (on a QuickStart VM):

import sys

from cm_api.api_client import ApiResource

# Performs the post-install Oozie service management for the red orbit module
def main():
    cmhost  = 'quickstart.cloudera'
    theUserName = "admin"
    thePassword = "admin"
    apiResource = ApiResource(cmhost, username=theUserName, password=thePassword)
    # A kludgey way of specifying the cluster, but there is only one here
    clusters = apiResource.get_all_clusters()
    # TODO: This is a kludge to resolve the cluster
    if (len(clusters) != 1):
        print "There should be one cluster, but there are " + repr(len(clusters)) + " clusters"
        sys.exit(1)
    cluster = clusters[0]
    # TODO: These parameters are appropriate for the quick start vm, are they right for our deployment?
    hueServiceName = "hue"
    oozieServiceName = "oozie"
    oozieServiceRoleName = 'oozie-OOZIE_SERVER'
    oozieService = cluster.get_service(oozieServiceName)
    hueService =   cluster.get_service(hueServiceName)

    oozieServiceRole = oozieService.get_role(oozieServiceRoleName)

    # TODO: Is this always going to be the oozie service role's name?  We may need to use the Role Type to be safe
    originalOozieConfig = oozieServiceRole.get_config(view='full') 

    oozieConfigUpdates = { "oozie_executor_extension_classes" : "org.apache.oozie.action.hadoop.MyCustomActionExecutor",
                           "oozie_workflow_extension_schemas" : "my-custom-action-0.1.xsd" }
    for configUpdateKey in oozieConfigUpdates:
        if (configUpdateKey in originalOozieConfig):
            print  repr(configUpdateKey) + " before update has oozie configured value = " + str(originalOozieConfig[configUpdateKey])
        else:
            print repr(configUpdateKey) + " not previously configured in oozie"
    
    # stop the oozie service after stopping any services depending on oozie (i.e. hue)
    for service in [hueService, oozieService]:
        print "stopping service " + repr(service.name)
        service.stop().wait() # synchronous stop
        print "service " + repr(service.name) + " stopped"
    
    # update the configuration while the servers are quiescent
    
    updatedOozieConfig = oozieServiceRole.update_config(oozieConfigUpdates)

    print "updatedOozieConfig = " + repr(updatedOozieConfig)
    
    
    for configUpdateKey in oozieConfigUpdates:
        print 'Config after update for key = ' + repr(configUpdateKey) + " has value = " + repr(updatedOozieConfig[configUpdateKey])

    # restart the oozie service before restarting any services depending on oozie (i.e. hue)
    for service in [oozieService, hueService]:
        print "restarting service " + repr(service.name)
        service.restart().wait() # synchronous restart
        print "service " + repr(service.name) + " restarted"

    # Done!
    return

# Run the update when the file is executed as a script
if __name__ == '__main__':
    main()
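
The TODO about the hard-coded role name could be handled by looking the role up by its type instead. Below is a minimal sketch, assuming the service object provides get_roles_by_type() the way cm_api's ApiService does. The helper name find_single_role is made up for illustration, and the role type string "OOZIE_SERVER" is inferred from the role name used above.

```python
def find_single_role(service, role_type):
    """Return the only role of `role_type` on `service`, or raise.

    `service` is expected to behave like a cm_api ApiService, i.e. to
    provide get_roles_by_type(role_type). This helper is illustrative
    and is not part of cm_api.
    """
    roles = list(service.get_roles_by_type(role_type))
    if len(roles) != 1:
        raise LookupError(
            "expected exactly one %s role, found %d" % (role_type, len(roles)))
    return roles[0]
```

In the script above this would replace the lookup by name, for example oozieServiceRole = find_single_role(oozieService, "OOZIE_SERVER"). Failing loudly when there are zero or several matching roles mirrors the single-cluster check earlier in the script.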


2 REPLIES

Mentor

Our regular documentation includes a page that maps every field you see in CM to its CM API name. For the Oozie service, for example, this page can be found at http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_props_cdh540_oozie.h...

 

For your two properties, the page above gives the following name mapping:

 

1. Oozie ActionService Executor Extension Classes => oozie_executor_extension_classes

2. Oozie SchemaService Workflow Extension Schemas => oozie_workflow_extension_schemas
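
The same mapping can probably be read back from Cloudera Manager itself. As a rough sketch: a role's get_config(view='full') in cm_api returns a dictionary of API names to config objects, and I am assuming here that each object carries a displayName attribute, as cm_api's ApiConfig does. The helper name display_name_map is hypothetical.

```python
def display_name_map(full_config):
    """Map UI display names to API names for a full-view config dict.

    `full_config` is expected to look like the result of
    role.get_config(view='full'): {api_name: config_object}, where each
    config_object has a displayName attribute. Entries without a
    display name are skipped. Hypothetical helper, not part of cm_api.
    """
    mapping = {}
    for api_name, cfg in full_config.items():
        display = getattr(cfg, "displayName", None)
        if display:
            mapping[display] = api_name
    return mapping
```

Looking up a field label from the CM page in the returned dictionary should then give the name to pass to update_config, provided the labels match what the API reports.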

 

So, following the config update example at http://cloudera.github.io/cm_api/docs/python-client/#configuring-services-and-roles and applying it to your Oozie properties, the code would look roughly like this:

 

# Get a handle to the API client
from cm_api.api_client import ApiResource

cm_host = "cm-host"
api = ApiResource(cm_host, username="admin", password="admin")

# Get cluster
cdh = None
for c in api.get_all_clusters():
  print c.name
  if c.version == "CDH5":
    cdh = c

# Get service
oozie = None
for s in cdh.get_all_services():
  print s
  if s.type == "OOZIE":
    oozie = s

oozie.update_config({'oozie_executor_extension_classes': 'com.mycompany.MyClass', 'oozie_workflow_extension_schemas': 'my-class.xsd'})
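
If these turn out to be role-level settings rather than service-level ones (the accepted solution set them on the Oozie Server role), a service-level update_config may not reach them. One alternative, sketched here under that assumption, is to update the role config groups of the matching role type. get_all_role_config_groups(), roleType, and update_config() are cm_api names as I understand them; the helper name update_role_type_config is hypothetical.

```python
def update_role_type_config(service, role_type, updates):
    """Apply `updates` to every role config group of `role_type`.

    `service` is expected to behave like a cm_api ApiService and each
    group like an ApiRoleConfigGroup. Returns {group_name: result of
    update_config} so the caller can inspect what each group reported.
    Hypothetical helper, not part of cm_api.
    """
    results = {}
    for group in service.get_all_role_config_groups():
        if group.roleType == role_type:
            results[group.name] = group.update_config(updates)
    return results
```

With the oozie handle from the snippet above, a call might look like update_role_type_config(oozie, "OOZIE_SERVER", {'oozie_executor_extension_classes': 'com.mycompany.MyClass'}). An empty result means no group of that type was found, which is worth checking before restarting anything.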

Does this help?
