Created on 07-18-2019 05:33 AM - edited 09-16-2022 07:31 AM
I just started using the cm_client Python client for Swagger (API v32). I wonder if I missed something, because some simple tasks seem quite hard to accomplish as a new API user.
For example, I am trying to use the API to set the YARN configuration mapreduce.job.split.metainfo.maxsize. In the CM GUI I can go to YARN - Configuration, search for the setting, and update it, which is quite simple. However, to do it through the API, I was not sure which "update_config" I should use because there are many of them. I tried "ServicesResourceApi.update_service_config()" for the YARN service, but the specific setting was not there. Then I thought I should try update_role_config or update_role_config_group. Then I realized I had to inspect the output of the many "read_config" calls manually to find out which API is relevant to the parameter. I could not do it programmatically because I was not sure what the setting would be called in the API: maybe mapreduce.job.split.metainfo.maxsize would be called mapreduce_job_split_metainfo_max_size, or something else.
Am I on the wrong path? How do savvy developers find out which API to use to update each specific parameter?
Thanks
Created 07-19-2019 04:23 PM
The trick is knowing which structure stores the values. In the case of mapreduce.job.split.metainfo.maxsize you can find it via the REST API like this:
(assuming your cluster is named cluster and your YARN service is named yarn)
http://cm_host:7180/api/v32/clusters/cluster/services/yarn/roleConfigGroups/yarn-GATEWAY-BASE/config?view=full
You can see the configuration:
{
"name" : "mapreduce_jobtracker_split_metainfo_maxsize",
"required" : false,
"default" : "10000000",
"displayName" : "JobTracker MetaInfo Maxsize",
"description" : "The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limits if set to -1.",
"relatedName" : "mapreduce.job.split.metainfo.maxsize",
"sensitive" : false,
"validationState" : "OK"
},
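The "relatedName" field in the entry above is what ties the API name to the Hadoop property name. As a minimal sketch of matching on it, the snippet below parses a trimmed response body and looks an entry up by "relatedName". The wrapping "items" list, the helper name find_by_related_name, and the second sample entry are assumptions made for illustration; only the first entry comes from the output shown above.

```python
import json

# Trimmed response body. The wrapping "items" list is an assumption about the
# response shape, and the second entry is made up purely for contrast.
SAMPLE_RESPONSE = """
{
  "items": [
    {
      "name": "mapreduce_jobtracker_split_metainfo_maxsize",
      "default": "10000000",
      "relatedName": "mapreduce.job.split.metainfo.maxsize"
    },
    {
      "name": "hypothetical_other_setting",
      "default": "1",
      "relatedName": "hypothetical.other.setting"
    }
  ]
}
"""


def find_by_related_name(response_text, related_name):
    """Return the config entries whose relatedName equals related_name."""
    entries = json.loads(response_text).get("items", [])
    return [e for e in entries if e.get("relatedName") == related_name]


matches = find_by_related_name(SAMPLE_RESPONSE, "mapreduce.job.split.metainfo.maxsize")
for entry in matches:
    # "name" is the identifier the API expects when the value is updated.
    print(entry["name"], "default:", entry["default"])
```

The value to carry forward is the entry's "name", since that is what an update call would need, not the dotted Hadoop property name.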
In Cloudera Manager, "mapreduce.job.split.metainfo.maxsize" is defined in the default Role Configuration Group for the YARN Gateway role type. The default value is 10000000.
Via Python, I used the following to add a "value" that was not None:
#!/usr/bin/env python
import cm_client
from cm_client.rest import ApiException
from pprint import pprint

# Configure HTTP basic authorization: basic
cm_client.configuration.username = 'admin'
cm_client.configuration.password = 'lizard'
cm_client.configuration.verify_ssl = True
# Path of truststore file in PEM
cm_client.configuration.ssl_ca_cert = '/opt/cloudera/security/cacerts/ClouderaSEC_combined.pem'

# Create an instance of the API class
api_host = 'https://host-10-17-100-224.coe.cloudera.com'
port = '7183'
api_version = 'v30'
# Construct base URL for API
# http://cmhost:7180/api/v30
api_url = api_host + ':' + port + '/api/' + api_version
api_client = cm_client.ApiClient(api_url)
cluster_api_instance = cm_client.ClustersResourceApi(api_client)

# This lists all clusters, so if you have more than one, set "cluster_name" to it.
# Otherwise, last one wins
api_response = cluster_api_instance.read_clusters(view='SUMMARY')
for cluster in api_response.items:
    cluster_name = cluster.name

# Get a list of all services and set myservice if the service type is "YARN"
services_api_instance = cm_client.ServicesResourceApi(api_client)
services = services_api_instance.read_services(cluster_name, view='FULL')
for service in services.items:
    if service.type == 'YARN':
        myservice = service

# Get a list of roles and get the GATEWAY type
roles_api_instance = cm_client.RolesResourceApi(api_client)
roles = roles_api_instance.read_roles(cluster_name, myservice.name)
for role in roles.items:
    if role.type == 'GATEWAY':
        myrole = role

# default is summary... this is only relevant if printing out information since
# full shows all values regardless if they have a non-default value or not
view = 'full'

# Uses RoleConfigGroupsResourceApi to retrieve all YARN role config groups
# Sets yarnGatewayBase to be the name of the role config group we want to update
rcg_api_instance = cm_client.RoleConfigGroupsResourceApi(api_client)
rcg_api_response = rcg_api_instance.read_role_config_groups(cluster_name, myservice.name)
pprint(rcg_api_response)
for cfg in rcg_api_response.items:
    if cfg.name == "yarn-GATEWAY-BASE":
        yarnGatewayBase = cfg.name

# read the config and iterate over each config till we find the one we want
# print out the config before update
yarn_gateway_base = rcg_api_instance.read_config(cluster_name, yarnGatewayBase, myservice.name, view=view)
for config in yarn_gateway_base.items:
    if config.related_name == 'mapreduce.job.split.metainfo.maxsize':
        print("mapreduce.job.split.metainfo.maxsize before update:\n===================\n")
        pprint(config)
        config_to_update = config.name

# Set message and new config values where "value" is what we want the new value to be
# use update_config() to update the value
message = 'testing update of mapreduce.job.split.metainfo.maxsize'  # str | Optional message describing the changes. (optional)
new_config = cm_client.ApiConfig(name=config_to_update, value="20000000")
new_config_list = cm_client.ApiConfigList([new_config])

print("mapreduce.job.split.metainfo.maxsize after update:\n====================\n")
try:
    # Updates the config for the given role config group.
    res = rcg_api_instance.update_config(cluster_name, yarnGatewayBase, myservice.name, message=message, body=new_config_list)
    pprint(res)
except ApiException as e:
    print("Exception when calling RoleConfigGroupsResourceApi->update_config: %s\n" % e)
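The script above hard-codes the role config group name yarn-GATEWAY-BASE, which only helps once you already know where the setting lives. A minimal sketch of searching every role config group of a service for a given related_name could look like the following. The helper name find_in_role_config_groups is hypothetical; it only relies on the read_role_config_groups and read_config calls and the name, value, and related_name attributes already used in the script, and it takes the API instance as a parameter so it does not depend on how the client was set up. It covers role config groups only, not service-level or per-role configuration.

```python
def find_in_role_config_groups(rcg_api, cluster_name, service_name, related_name):
    """Return (group_name, api_name, value) for each role config group of the
    service that holds a config entry with the given related_name.

    rcg_api is expected to behave like cm_client.RoleConfigGroupsResourceApi
    (hypothetical helper, not part of cm_client itself).
    """
    matches = []
    groups = rcg_api.read_role_config_groups(cluster_name, service_name)
    for group in groups.items:
        # view='full' so that entries still at their default value are listed too.
        configs = rcg_api.read_config(cluster_name, group.name, service_name, view='full')
        for config in configs.items:
            if config.related_name == related_name:
                matches.append((group.name, config.name, config.value))
    return matches
```

Called as find_in_role_config_groups(rcg_api_instance, cluster_name, myservice.name, 'mapreduce.job.split.metainfo.maxsize'), each returned tuple gives the group name and API config name that update_config needs. An empty list would suggest the setting lives in another structure, such as the service-level configuration.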
If you have questions about anything, let us know.
Created on 07-19-2019 05:08 PM - edited 07-19-2019 05:09 PM
Thank you @bgooley
Created 09-18-2019 09:29 AM
Okay, but what is the trick? As far as I can see, it's just a dumb way of manually finding which structure stores which config. Not handy for automating. Is there any API that can tell which config belongs to which structure?