Created on 07-18-2018 09:46 AM - edited 09-16-2022 06:28 AM
Hi
I am using the Cloudera Manager API.
I am trying to modify an attribute of the HBase service.
It is hbase_service_config_safety_valve.
I create a new ApiConfig object to replace the old one.
I use the ApiService class and its update_config method to update the attribute.
But when I pass the dictionary to update_config, I get this error:
Traceback (most recent call last):
  File "rech.py", line 36, in <module>
    hbase.update_config(m[0])
  File "/usr/local/lib/python2.7/dist-packages/cm_api/endpoints/services.py", line 575, in update_config
    resp = self._get_resource_root().put(path, data = json.dumps(data))
  File "/usr/lib/python2.7/json/__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <cm_api.endpoints.types.ApiConfig object at 0x7f0287020ed0> is not JSON serializable
My code is below:
    # Get a list of all clusters
    cdh5 = None
    for c in api.get_all_clusters():
        # print c.name
        if c.version == "CDH5":
            cdh5 = c
    for s in cdh5.get_all_services():
        # print s
        if s.type == "HBASE":
            hbase = s
    m = hbase.get_config(view='full')
    j = ApiConfig(api, name='hbase_service_config_safety_valve', value='<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>')
    m[0]['hbase_service_config_safety_valve'] = j
    hbase.update_config(m[0])
Thank you very much
Created 07-27-2018 01:53 AM
I am using Cloudera Manager to manage my cluster.
I found my problem: I was trying to update a parameter that had already been configured by the Cloudera Manager team and that is a constant value.
Cloudera Manager does not allow updating some parameters, such as:
io.storefile.bloom.block.size
and the other constant parameters, which you can find here: https://www.cloudera.com/documentation/other/shared/CDH5-Beta-2-RNs/hbase_jdiff_report-p-cdh4.5-c-cd...
So my problem is solved.
Thank you very much for your help.
Created 07-19-2018 03:41 AM
I think what you need is the following
    hbase_config = {
        'hbase_service_config_safety_valve': '<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>'
    }
    hbase.update_config(hbase_config)
I have tested it and it works. There is no need to set "m" with get_config and then re-apply the whole configuration back. You only have to update the specific safety_valve and not all hbase config.
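To illustrate why the plain dictionary works while the original attempt fails, here is a minimal sketch. FakeApiConfig is a hypothetical stand-in written for this example, not the real cm_api class; the only assumption is that, like the ApiConfig object in the traceback, it is an ordinary Python object that json.dumps does not know how to encode.

```python
import json


class FakeApiConfig(object):
    """Hypothetical stand-in for cm_api's ApiConfig; not the real class."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


SNIPPET = ('<property><name>regionserver.global.memstore.upperLimit</name>'
           '<value>0.15</value></property>')


def can_serialize(config):
    """Return True if json.dumps accepts the dictionary as-is."""
    try:
        json.dumps(config)
        return True
    except TypeError:
        return False


# A dictionary whose value is an object, as in the failing code.
as_object = {'hbase_service_config_safety_valve':
             FakeApiConfig('hbase_service_config_safety_valve', SNIPPET)}
# A dictionary whose value is a plain string, as in the working code.
as_string = {'hbase_service_config_safety_valve': SNIPPET}

print(can_serialize(as_object))  # False
print(can_serialize(as_string))  # True
```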
The only catch is that you have to supply every property you want in hbase_service_config_safety_valve, because the update sets the whole value. The same happens with your approach: you are not updating only "regionserver.global.memstore.upperLimit", you are assigning a value to the entire safety valve.
Of course, you can write additional code to parse the existing safety valve configuration (the XML part) and add or update regionserver.global.memstore.upperLimit, but you still have to pass the result in the same dictionary (JSON-style) format as in my example.
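The parsing step could look roughly like the sketch below. It is written for Python 3 and uses only the standard library. The helper name upsert_safety_valve_property and the temporary wrapper element are my own choices for illustration, and the sketch assumes the safety valve text is a sequence of bare <property> elements like the one above.

```python
import xml.etree.ElementTree as ET


def upsert_safety_valve_property(snippet, name, value):
    """Return `snippet` with property `name` set to `value`, keeping the rest."""
    # The safety valve holds bare <property> elements, so wrap them in a
    # temporary root to make the text parseable as one XML document.
    root = ET.fromstring("<configuration>%s</configuration>" % (snippet or ""))
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: overwrite its value.
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Serialize each property again without the temporary root.
    return "".join(ET.tostring(p, encoding="unicode") for p in root)


existing = "<property><name>some.other.setting</name><value>1</value></property>"
updated = upsert_safety_valve_property(
    existing, "regionserver.global.memstore.upperLimit", "0.15")
print(updated)
```

The returned string would then go into the dictionary passed to update_config, for example {'hbase_service_config_safety_valve': updated}. The name some.other.setting is a placeholder, not a real HBase parameter.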
Created 07-20-2018 03:25 AM
Thank you
I tried the same thing and it works, but when I do the same for the following parameters I get an error.
It says: unknown parameters.
"hbase.rs.cacheblocksonwrite"
"hbase.storescanner.parallel.seek.threads"
"hbase.storescanner.parallel.seek.enable"
"hbase.regionserver.hlog.blocksize"
"hbase.client.max.perregion.tasks"
"hbase.client.max.perserver.tasks"
"hbase.ipc.server.callqueue.handler.factor"
"hbase.ipc.server.callqueue.read.ratio"
"hbase.ipc.server.callqueue.scan.ratio"
"io.storefile.bloom.block.size"
"ycsb.client.threads"
"hfile.index.block.max.size"
"hfile.block.bloom.cacheonwrite"
"hfile.block.index.cacheonwrite"
Could you please test it on your cluster?
I think some parameters cannot be changed this way, but I am not sure.
Thank you very much
Created 07-20-2018 04:25 PM
Hi @yassine24,
Please share exactly what you tried to do in your code and the exact error, stack, and/or JSON response so we can help.
Created 07-23-2018 12:25 PM
Hi,
I am trying to implement this paper : ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
(https://ieeexplore.ieee.org/document/7950900/)
For this I have to collect a training set. Section III shows the structure of the training set.
The beginning of page 5 explains how to generate the data for the training set:
To build the models, we need to construct a training set S. S is a matrix, with each row being the following vector: $v_j = [\mathrm{perf}_j, c_{1j}, \ldots, c_{ij}, \ldots, c_{nj}]$, $j = 1, \ldots, m$ (1), with $v_j$ the $j$th observation, $\mathrm{perf}_j$ the throughput or latency, and $c_{ij}$ the $i$th HBase configuration parameter of the $j$th observation. $n$ is the total number of HBase configuration parameters, and $m$ is the total number of vectors in matrix S (observations or training examples).
I need to generate random values for the parameters shown in the table on page 8.
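As a rough sketch of that sampling step, the snippet below draws one random configuration and builds one row of S in the form [perf, c_1, ..., c_n]. The ranges and types in PARAM_RANGES are made up for illustration (the real ones are in the paper's table and are not reproduced here), and the throughput figure is a placeholder rather than a measurement.

```python
import random

# Hypothetical ranges for illustration only; replace with the paper's table.
PARAM_RANGES = {
    "hbase.client.max.perregion.tasks": (1, 10),           # int range (assumed)
    "hbase.ipc.server.callqueue.read.ratio": (0.0, 1.0),   # float range (assumed)
    "hfile.block.index.cacheonwrite": (False, True),       # boolean (assumed)
}


def sample_config(ranges, rng):
    """Draw one random value per parameter from its (low, high) range."""
    config = {}
    for name in sorted(ranges):
        low, high = ranges[name]
        # Check bool first: bool is a subclass of int in Python.
        if isinstance(low, bool):
            config[name] = rng.choice([low, high])
        elif isinstance(low, int):
            config[name] = rng.randint(low, high)
        else:
            config[name] = rng.uniform(low, high)
    return config


def observation_row(perf, config):
    """Build one row [perf, c_1, ..., c_n], parameters in sorted-name order."""
    return [perf] + [config[name] for name in sorted(config)]


rng = random.Random(0)  # fixed seed so a run can be repeated
config = sample_config(PARAM_RANGES, rng)
row = observation_row(1234.5, config)  # 1234.5 is a placeholder throughput
print(row)
```

Sorting the parameter names keeps the column order identical across observations, which matters once the rows are stacked into the matrix.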
I used the Cloudera Manager API. I go through the HBase service and try to update the parameters with Python code.
But there are some parameters that are not in the HBase service, such as:
"hbase.rs.cacheblocksonwrite"
"hbase.storescanner.parallel.seek.threads"
"hbase.storescanner.parallel.seek.enable"
"hbase.regionserver.hlog.blocksize"
"hbase.client.max.perregion.tasks"
"hbase.client.max.perserver.tasks"
"hbase.ipc.server.callqueue.handler.factor"
"hbase.ipc.server.callqueue.read.ratio"
"hbase.ipc.server.callqueue.scan.ratio"
"io.storefile.bloom.block.size"
"ycsb.client.threads"
"hfile.index.block.max.size"
"hfile.block.bloom.cacheonwrite"
"hfile.block.index.cacheonwrite"
Maybe not all of them, but most of them.
So I wonder whether my setup does not use these parameters or my code does not work.
My code is below:
    # Get a list of all clusters
    cdh5 = None
    for c in api.get_all_clusters():
        # print c.name
        if c.version == "CDH5":
            cdh5 = c
    for s in cdh5.get_all_services():
        # print s
        if s.type == "HBASE":
            hbase = s
    nn = None
    for r in hbase.get_all_services():
        print r
        if r.type == 'HBASE':
            nn = r
    print "Role name: %s\nState: %s\nHealth: %s\nHost: %s" % (
        nn.name, nn.roleState, nn.healthSummary, nn.hostRef.hostId)
    t = r.get_config(view='full')
    for key, value in t.iteritems():
        print(key)
        print(value)
    nn.update_config({
        'io_storefile_bloom_block_size': True
        # the other parameters with their values
    })
Thank you
I hope I was clear.
Created 07-24-2018 11:32 AM
io.storefile.bloom.block.size requires an integer, not a boolean.
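A small check before sending values can catch this kind of mismatch early. The sketch below is only an illustration: check_value is a hypothetical helper, and the number 131072 is an example value, not a recommendation. Note that in Python bool is a subclass of int, so True passes a plain isinstance(value, int) test and has to be rejected explicitly.

```python
def check_value(name, value, expected_type):
    """Return `value` if it matches `expected_type`, else raise TypeError."""
    # bool is a subclass of int, so reject it explicitly for integer parameters.
    if expected_type is int and isinstance(value, bool):
        raise TypeError("%s expects an integer, got boolean %r" % (name, value))
    if not isinstance(value, expected_type):
        raise TypeError("%s expects %s, got %r"
                        % (name, expected_type.__name__, value))
    return value


# An integer is accepted.
print(check_value("io.storefile.bloom.block.size", 131072, int))  # 131072

# A boolean is rejected even though isinstance(True, int) is True.
try:
    check_value("io.storefile.bloom.block.size", True, int)
except TypeError as exc:
    print(exc)
```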
The background is good, but I'm not sure what problem you are seeing when you try to update the configuration.