How to find out which API call to use for updating a specific configuration setting (Swagger API v32)

Explorer

I just started using the cm_client Python library for Swagger API v32. I wonder if I missed something, because some simple tasks seem quite hard to accomplish as a new API user.

For example, I am trying to use the API to set the YARN configuration mapreduce.job.split.metainfo.maxsize. In the CM GUI I can go to YARN - Configuration, search for the setting, and update it, which is quite simple. To do it through the API, however, I was not sure which "update_config" to use because there are many of them. I tried "ServicesResourceApi.update_service_config()" for the YARN service, but the setting was not there. Then I thought I should try update_role_config or update_role_config_group. Then I realized I had to inspect the output of the many "read_config" calls manually to find out which API is relevant to the parameter. I could not do it programmatically because I was not sure what the setting would be called in the API: maybe mapreduce.job.split.metainfo.maxsize would be called mapreduce_job_split_metainfo_max_size, or something else.

Am I on the wrong path? How do savvy developers find out which API to use to update a specific parameter?

Thanks

1 ACCEPTED SOLUTION

Super Guru

@Kevin_Z,

 

The trick is knowing which structure stores the values. In the case of mapreduce.job.split.metainfo.maxsize, you can find it via the REST API like this:

(assuming your cluster is named cluster and your YARN service is named yarn)
http://cm_host:7180/api/v32/clusters/cluster/services/yarn/roleConfigGroups/yarn-GATEWAY-BASE/config?view=full

You can see the configuration:

 

{
"name" : "mapreduce_jobtracker_split_metainfo_maxsize",
"required" : false,
"default" : "10000000",
"displayName" : "JobTracker MetaInfo Maxsize",
"description" : "The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limits if set to -1.",
"relatedName" : "mapreduce.job.split.metainfo.maxsize",
"sensitive" : false,
"validationState" : "OK"
},
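
To avoid reading the full output by eye, the lookup can be scripted: match on "relatedName" (the Hadoop property name) and read off "name" (the key the API expects). Below is a minimal Python 3 sketch that uses a trimmed copy of the item above as sample data. The "items" wrapper around the list and the find_api_name helper are assumptions for illustration, not part of cm_client.

```python
import json

# Sample data: a trimmed copy of the item shown above. A real response from
# .../config?view=full would contain many more items (the "items" wrapper is
# an assumption about the response shape).
SAMPLE_RESPONSE = """
{
  "items": [
    {
      "name": "mapreduce_jobtracker_split_metainfo_maxsize",
      "required": false,
      "default": "10000000",
      "displayName": "JobTracker MetaInfo Maxsize",
      "relatedName": "mapreduce.job.split.metainfo.maxsize",
      "sensitive": false,
      "validationState": "OK"
    }
  ]
}
"""


def find_api_name(response_text, related_name):
    """Return the API 'name' of the item whose 'relatedName' matches, or None."""
    for item in json.loads(response_text).get("items", []):
        if item.get("relatedName") == related_name:
            return item["name"]
    return None


api_name = find_api_name(SAMPLE_RESPONSE, "mapreduce.job.split.metainfo.maxsize")
print(api_name)
```

Returning None when nothing matches makes it easy to try the same lookup against several config endpoints in turn.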

 

The comparable Python client call for the REST request above is documented here:

 

https://archive.cloudera.com/cm6/6.2.0/generic/jar/cm_api/swagger-html-sdk-docs/python/docs/RoleConf...

 

Cloudera Manager stores configuration at the following levels:

  • Service Configuration (configuration items that apply to every role for that service)
  • Role Configuration Groups (each role uses a role configuration group, which defines configuration for all roles belonging to that group)
  • Overrides (a configuration value can be set for a specific role, overriding other configurations)

In the case of "mapreduce.job.split.metainfo.maxsize", it is defined in the default Role Configuration Group for the YARN Gateway role type. The default value is 10000000.
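
Because a property can live at any of these levels, one way to locate it without manual inspection is to walk the service configuration and every role config group, matching on related_name. The sketch below is Python 3 and takes the API objects as arguments; find_config_locations is a hypothetical helper, and the two Fake classes with their data are made-up stand-ins so the example runs without a Cloudera Manager instance. Per-role overrides would need a similar loop over roles, which is not shown.

```python
from types import SimpleNamespace as NS


def find_config_locations(services_api, rcg_api, cluster_name, service_name, related_name):
    """Return (level, group_name, api_name) for each place the property appears."""
    hits = []
    # Level 1: service-wide configuration.
    svc_cfg = services_api.read_service_config(cluster_name, service_name, view="full")
    for cfg in svc_cfg.items:
        if cfg.related_name == related_name:
            hits.append(("service", None, cfg.name))
    # Level 2: every role config group of the service.
    groups = rcg_api.read_role_config_groups(cluster_name, service_name)
    for group in groups.items:
        group_cfg = rcg_api.read_config(cluster_name, group.name, service_name, view="full")
        for cfg in group_cfg.items:
            if cfg.related_name == related_name:
                hits.append(("roleConfigGroup", group.name, cfg.name))
    return hits


# Stand-ins with made-up data, mimicking only the calls used above.
class FakeServicesApi:
    def read_service_config(self, cluster_name, service_name, view="summary"):
        return NS(items=[NS(name="some_service_setting", related_name="some.service.setting")])


class FakeRcgApi:
    def read_role_config_groups(self, cluster_name, service_name):
        return NS(items=[NS(name="yarn-GATEWAY-BASE"), NS(name="yarn-OTHER-BASE")])

    def read_config(self, cluster_name, role_config_group_name, service_name, view="summary"):
        if role_config_group_name == "yarn-GATEWAY-BASE":
            return NS(items=[NS(name="mapreduce_jobtracker_split_metainfo_maxsize",
                                related_name="mapreduce.job.split.metainfo.maxsize")])
        return NS(items=[])


hits = find_config_locations(FakeServicesApi(), FakeRcgApi(), "cluster", "yarn",
                             "mapreduce.job.split.metainfo.maxsize")
print(hits)
```

With a real cluster, the same function could be called with cm_client.ServicesResourceApi(api_client) and cm_client.RoleConfigGroupsResourceApi(api_client) in place of the stand-ins.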

 

Via Python, I used the following to set a "value" that was not None:

 

#!/usr/bin/env python

import cm_client
from cm_client.rest import ApiException
from pprint import pprint

# Configure HTTP basic authorization: basic
cm_client.configuration.username = 'admin'
cm_client.configuration.password = 'lizard'
cm_client.configuration.verify_ssl = True
# Path of truststore file in PEM
cm_client.configuration.ssl_ca_cert = '/opt/cloudera/security/cacerts/ClouderaSEC_combined.pem'

# Create an instance of the API class
api_host = 'https://host-10-17-100-224.coe.cloudera.com'
port = '7183'
api_version = 'v30'
# Construct base URL for API
# http://cmhost:7180/api/v30
api_url = api_host + ':' + port + '/api/' + api_version
api_client = cm_client.ApiClient(api_url)
cluster_api_instance = cm_client.ClustersResourceApi(api_client)

# This lists all clusters, so if you have more than one, set "cluster_name" to it.  Otherwise, last one wins
api_response = cluster_api_instance.read_clusters(view='SUMMARY')
for cluster in api_response.items:
    cluster_name = cluster.name

# Get a list of all services and set myservice if the service type is "YARN"
services_api_instance = cm_client.ServicesResourceApi(api_client)
services = services_api_instance.read_services(cluster_name, view='FULL')
for service in services.items:
    if service.type == 'YARN':
        myservice = service

# Get a list of roles and get the GATEWAY type
roles_api_instance = cm_client.RolesResourceApi(api_client)
roles = roles_api_instance.read_roles(cluster_name, myservice.name)
for role in roles.items:
    if role.type == 'GATEWAY':
        myrole = role
# default is summary... this is only relevant if printing out information since full shows all values
# regardless if they have a non-default value or not
view = 'full'

# Uses RoleConfigGroupsResourceApi to retrieve all YARN role config groups
# Sets yarnGatewayBase to be the name of the role config group we want to update
rcg_api_instance = cm_client.RoleConfigGroupsResourceApi(api_client)
rcg_api_response = rcg_api_instance.read_role_config_groups(cluster_name, myservice.name)
pprint(rcg_api_response)
for cfg in rcg_api_response.items:
    if cfg.name == "yarn-GATEWAY-BASE":
        yarnGatewayBase = cfg.name

# Read the config and iterate over each item until we find the one we want.
# Print out the config before the update.
yarn_gateway_base = rcg_api_instance.read_config(cluster_name, yarnGatewayBase, myservice.name, view=view)
for config in yarn_gateway_base.items:
    if config.related_name == 'mapreduce.job.split.metainfo.maxsize':
        print("mapreduce.job.split.metainfo.maxsize before update:\n===================\n")
        pprint(config)
        config_to_update = config.name

# Set the message and the new config, where "value" is what we want the new value to be.
# Use update_config() to update the value.
message = 'testing update of mapreduce.job.split.metainfo.maxsize'  # optional message describing the change
new_config = cm_client.ApiConfig(name=config_to_update, value="20000000")
new_config_list = cm_client.ApiConfigList([new_config])
try:
    # Updates the config for the given role config group.
    res = rcg_api_instance.update_config(cluster_name, yarnGatewayBase, myservice.name, message=message, body=new_config_list)
    print("mapreduce.job.split.metainfo.maxsize after update:\n====================\n")
    pprint(res)
except ApiException as e:
    print("Exception when calling RoleConfigGroupsResourceApi->update_config: %s\n" % e)
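
For comparison, the same update can be expressed as a raw REST request against the role config group URL shown earlier. This is a Python 3 sketch under two assumptions: the PUT body is the JSON form of ApiConfigList (an "items" array of name/value pairs), and the message is passed as a query parameter, mirroring update_config() above. build_update_request is a hypothetical helper; it only builds the URL and body and does not send anything.

```python
import json
from urllib.parse import quote, urlencode


def build_update_request(base_url, cluster, service, group, name, value, message=None):
    """Return (url, body) for a PUT that sets one role config group value."""
    path = "/clusters/{}/services/{}/roleConfigGroups/{}/config".format(
        quote(cluster, safe=""), quote(service, safe=""), quote(group, safe=""))
    url = base_url + path
    if message:
        # Optional change description, sent as a query parameter (assumption).
        url += "?" + urlencode({"message": message})
    # Assumed body shape: the JSON form of ApiConfigList.
    body = json.dumps({"items": [{"name": name, "value": value}]})
    return url, body


url, body = build_update_request(
    "http://cm_host:7180/api/v32", "cluster", "yarn", "yarn-GATEWAY-BASE",
    "mapreduce_jobtracker_split_metainfo_maxsize", "20000000",
    message="testing update")
print(url)
print(body)
```

The resulting URL and body could then be sent with any HTTP client using basic authentication and a JSON content type.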

If you have questions about anything, let us know.

 


3 REPLIES

Explorer

Thank you @bgooley

New Contributor

Okay, but what is the trick? As far as I can see, it's just a manual way of finding out which structure stores which config. That's not handy for automation. Is there any API that can tell which config belongs to which structure?
