Created on 07-18-2019 05:33 AM - edited 09-16-2022 07:31 AM
I just started using the cm_client Python client for Swagger v32. I wonder if I missed something, because some simple tasks seem quite hard to accomplish as a new API user.
For example, I am trying to use the API to set the YARN configuration mapreduce.job.split.metainfo.maxsize. In the CM GUI I can go to YARN - Configuration, search for the setting, and update it, which is quite simple. However, to do it through the API, I was not sure which "update_config" to use because there are many of them. I tried "ServicesResourceApi.update_service_config()" for the YARN service, but the specific setting was not there. Then I thought I should try update_role_config or update_role_config_group. Then I realized I had to inspect the output of the many "read_config" calls manually to find out which API might be relevant to the parameter. I could not do it programmatically because I was not sure what the setting would be called in the API: maybe mapreduce.job.split.metainfo.maxsize would be called mapreduce_job_split_metainfo_max_size or something else.
Am I on the wrong path? How do savvy developers find out which API to use to update a specific parameter?
Thanks
Created 07-19-2019 04:23 PM
The trick is knowing which structure stores the values. In the case of mapreduce.job.split.metainfo.maxsize, you can find it via the REST API like this:
(assuming your cluster is named cluster and your YARN service is named yarn)
http://cm_host:7180/api/v32/clusters/cluster/services/yarn/roleConfigGroups/yarn-GATEWAY-BASE/config?view=full
You can see the configuration:
{
  "name" : "mapreduce_jobtracker_split_metainfo_maxsize",
  "required" : false,
  "default" : "10000000",
  "displayName" : "JobTracker MetaInfo Maxsize",
  "description" : "The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limits if set to -1.",
  "relatedName" : "mapreduce.job.split.metainfo.maxsize",
  "sensitive" : false,
  "validationState" : "OK"
},
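To pick that entry out of the full response without reading it by eye, you can match on relatedName instead of guessing the API name. A minimal sketch, assuming the response has already been parsed into a dict with an "items" list; the function name find_by_related_name and the trimmed sample data are only for illustration:

```python
import json


def find_by_related_name(config_list, related_name):
    """Return the config entries whose relatedName matches the given property name."""
    return [item for item in config_list.get("items", [])
            if item.get("relatedName") == related_name]


# Trimmed sample in the shape of the response shown above (the second entry is made up)
sample = json.loads("""
{
  "items": [
    {"name": "mapreduce_jobtracker_split_metainfo_maxsize",
     "default": "10000000",
     "relatedName": "mapreduce.job.split.metainfo.maxsize"},
    {"name": "hypothetical_other_setting",
     "relatedName": "some.other.property"}
  ]
}
""")

matches = find_by_related_name(sample, "mapreduce.job.split.metainfo.maxsize")
for m in matches:
    # "name" is the identifier the API expects when you update the value
    print(m["name"])
```

The "name" field of the match is what you pass back when updating, while "relatedName" is the property name you already know from the service's own configuration files.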
Cloudera Manager shows the same thing: "mapreduce.job.split.metainfo.maxsize" is defined in the default Role Configuration Group for the YARN Gateway role type. The default value is 10000000.
Via Python, I used the following to add a "value" that was not None:
#!/usr/bin/env python
import cm_client
from cm_client.rest import ApiException
from pprint import pprint

# Configure HTTP basic authorization: basic
cm_client.configuration.username = 'admin'
cm_client.configuration.password = 'lizard'
cm_client.configuration.verify_ssl = True
# Path of truststore file in PEM
cm_client.configuration.ssl_ca_cert = '/opt/cloudera/security/cacerts/ClouderaSEC_combined.pem'

# Create an instance of the API class
api_host = 'https://host-10-17-100-224.coe.cloudera.com'
port = '7183'
api_version = 'v30'
# Construct base URL for API, e.g. https://cmhost:7183/api/v30
api_url = api_host + ':' + port + '/api/' + api_version
api_client = cm_client.ApiClient(api_url)
cluster_api_instance = cm_client.ClustersResourceApi(api_client)

# This lists all clusters, so if you have more than one, set "cluster_name" to it. Otherwise, last one wins
api_response = cluster_api_instance.read_clusters(view='SUMMARY')
for cluster in api_response.items:
    cluster_name = cluster.name

# Get a list of all services and set myservice if the service type is "YARN"
services_api_instance = cm_client.ServicesResourceApi(api_client)
services = services_api_instance.read_services(cluster_name, view='FULL')
for service in services.items:
    if service.type == 'YARN':
        myservice = service

# Get a list of roles and get the GATEWAY type (not needed for the update below)
roles_api_instance = cm_client.RolesResourceApi(api_client)
roles = roles_api_instance.read_roles(cluster_name, myservice.name)
for role in roles.items:
    if role.type == 'GATEWAY':
        myrole = role

# default is summary... this is only relevant if printing out information since full shows all values
# regardless if they have a non-default value or not
view = 'full'

# Uses RoleConfigGroupsResourceApi to retrieve all YARN role config groups
# Sets yarnGatewayBase to be the name of the role config group we want to update
rcg_api_instance = cm_client.RoleConfigGroupsResourceApi(api_client)
rcg_api_response = rcg_api_instance.read_role_config_groups(cluster_name, myservice.name)
pprint(rcg_api_response)
for cfg in rcg_api_response.items:
    if cfg.name == "yarn-GATEWAY-BASE":
        yarnGatewayBase = cfg.name

# read the config and iterate over each config till we find the one we want
# print out the config before update
yarn_gateway_base = rcg_api_instance.read_config(cluster_name, yarnGatewayBase, myservice.name, view=view)
for config in yarn_gateway_base.items:
    if config.related_name == 'mapreduce.job.split.metainfo.maxsize':
        print("mapreduce.job.split.metainfo.maxsize before update:\n===================\n")
        pprint(config)
        config_to_update = config.name

# Set message and new config values where "value" is what we want the new value to be
# use update_config() to update the value
message = 'testing update of mapreduce.job.split.metainfo.maxsize'  # str | Optional message describing the changes. (optional)
new_config = cm_client.ApiConfig(name=config_to_update, value="20000000")
new_config_list = cm_client.ApiConfigList([new_config])
try:
    # Updates the config for the given role config group.
    res = rcg_api_instance.update_config(cluster_name, yarnGatewayBase, myservice.name, message=message, body=new_config_list)
    print("mapreduce.job.split.metainfo.maxsize after update:\n====================\n")
    pprint(res)
except ApiException as e:
    print("Exception when calling RoleConfigGroupsResourceApi->update_config: %s\n" % e)

If you have questions about anything, let us know.
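If you do not know in advance which structure holds a property, one option is to search the service-level config and every role config group for the relatedName. I am not aware of a single call that does this lookup for you, so this is only a sketch: find_config_locations is a made-up helper, and it assumes services_api and rcg_api are cm_client.ServicesResourceApi and cm_client.RoleConfigGroupsResourceApi instances created as in the script above.

```python
def find_config_locations(services_api, rcg_api, cluster_name, service_name, related_name):
    """Return (location, api_name) pairs for every config whose related_name matches.

    location is "service" for the service-level config, otherwise the
    name of the role config group that holds the setting.
    """
    hits = []

    # Service-wide settings (the ones update_service_config() would change)
    svc_cfg = services_api.read_service_config(cluster_name, service_name, view='full')
    for cfg in svc_cfg.items or []:
        if cfg.related_name == related_name:
            hits.append(("service", cfg.name))

    # Settings stored per role config group (the ones update_config() would change)
    groups = rcg_api.read_role_config_groups(cluster_name, service_name)
    for group in groups.items or []:
        rcg_cfg = rcg_api.read_config(cluster_name, group.name, service_name, view='full')
        for cfg in rcg_cfg.items or []:
            if cfg.related_name == related_name:
                hits.append((group.name, cfg.name))

    return hits
```

With the objects from the script above, the call would be find_config_locations(services_api_instance, rcg_api_instance, cluster_name, myservice.name, 'mapreduce.job.split.metainfo.maxsize'). Each pair gives you the role config group to update and the name to put in ApiConfig. Overrides set on individual roles are not covered by this sketch.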
Created on 07-19-2019 05:08 PM - edited 07-19-2019 05:09 PM
Thank you @bgooley
Created 09-18-2019 09:29 AM
Okay, but what is the trick? As far as I can see, it's just a dumb way of manually finding which structure stores which config. Not handy for automating. Is there any API that can tell which config belongs to which structure?