Cloudera Manager API: modify HBase service

Explorer

Hi

I am using the Cloudera Manager API.

I am trying to modify an attribute of the HBase service: hbase_service_config_safety_valve.

I create a new ApiConfig object to replace the old value, then use the update_config method of the ApiService class to update the attribute.

But when I pass the dictionary to update_config, I get this error:

Traceback (most recent call last):
  File "rech.py", line 36, in <module>
    hbase.update_config(m[0])
  File "/usr/local/lib/python2.7/dist-packages/cm_api/endpoints/services.py", line 575, in update_config
    resp = self._get_resource_root().put(path, data = json.dumps(data))
  File "/usr/lib/python2.7/json/__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <cm_api.endpoints.types.ApiConfig object at 0x7f0287020ed0> is not JSON serializable

My code is below:

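from cm_api.endpoints.types import ApiConfig

# `api` is the ApiResource connection created earlier (not shown in this post)

# Find the CDH5 cluster
cdh5 = None
for c in api.get_all_clusters():
  # print c.name
  if c.version == "CDH5":
    cdh5 = c

# Find the HBase service
for s in cdh5.get_all_services():
  # print s
  if s.type == "HBASE":
    hbase = s

# get_config returns a tuple; m[0] is the service-level configuration
m = hbase.get_config(view='full')

j = ApiConfig(api, name='hbase_service_config_safety_valve', value='<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>')
m[0]['hbase_service_config_safety_valve'] = j
# This is the call that raises the TypeError shown above
hbase.update_config(m[0])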

Thank you very much

 

1 ACCEPTED SOLUTION

Explorer

 

I am using Cloudera Manager to manage my cluster.

I found my problem: I was trying to update a parameter that has already been configured by the Cloudera Manager team and is a constant value.

Cloudera Manager does not allow updating some parameters, such as:

io.storefile.bloom.block.size

You can find the other constant parameters here: https://www.cloudera.com/documentation/other/shared/CDH5-Beta-2-RNs/hbase_jdiff_report-p-cdh4.5-c-cd...

So my problem is solved.

Thank you very much for your help.

 

 


6 REPLIES

Super Collaborator

I think what you need is the following:

 

hbase_config= { 'hbase_service_config_safety_valve': '<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>' }

hbase.update_config(hbase_config)

I have tested it and it works. There is no need to fetch "m" with get_config and then re-apply the whole configuration. You only have to update the specific safety valve, not the entire HBase config.
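To illustrate why the plain dictionary works where the ApiConfig object did not: the traceback shows update_config handing its data to json.dumps, which only knows how to encode basic types. The sketch below reproduces this without cm_api. FakeApiConfig is a hypothetical stand-in written for this example, not the real class.

```python
import json


class FakeApiConfig(object):
    """Hypothetical stand-in for an ApiConfig-like object (not the real cm_api class)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


snippet = ("<property><name>regionserver.global.memstore.upperLimit</name>"
           "<value>0.15</value></property>")

# A dict whose value is a custom object cannot be encoded by json.dumps.
bad = {"hbase_service_config_safety_valve":
       FakeApiConfig("hbase_service_config_safety_valve", snippet)}
try:
    json.dumps(bad)
    failed = False
except TypeError:
    failed = True

# A dict of plain strings encodes without trouble.
good = {"hbase_service_config_safety_valve": snippet}
payload = json.dumps(good)
```

The first call fails in the same way as the traceback in the question, while the second produces an ordinary JSON string.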

 

The only catch is that you have to supply all of the hbase_service_config_safety_valve properties at once. The same is true of your approach, because you are not updating only "regionserver.global.memstore.upperLimit"; you are assigning a new value to the whole safety valve.

Of course, you can write additional code to parse the existing safety valve content (the XML part) and add or update regionserver.global.memstore.upperLimit, but you still have to pass the result as a plain, JSON-serializable dictionary as in my example.
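A minimal sketch of that parse-and-update step, assuming the safety valve holds a bare sequence of <property> elements like the one above. The helper name set_safety_valve_property is made up for this example, and only the standard library is used. Fetching the current value from Cloudera Manager is left out.

```python
import xml.etree.ElementTree as ET


def set_safety_valve_property(snippet, name, value):
    """Return the snippet with property `name` set to `value`, adding it if absent."""
    # The snippet has no single root element, so wrap it in a temporary one.
    root = ET.fromstring("<configuration>%s</configuration>" % (snippet or ""))
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing property with that name: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Serialize only the children, dropping the temporary root.
    return "".join(ET.tostring(p).decode("ascii") for p in root)


existing = "<property><name>hbase.example.setting</name><value>1</value></property>"
merged = set_safety_valve_property(
    existing, "regionserver.global.memstore.upperLimit", "0.15")
hbase_config = {"hbase_service_config_safety_valve": merged}
```

The resulting hbase_config is a plain dictionary of strings, so it can be passed to update_config in the same way as the example above. The property hbase.example.setting is a placeholder, not a real HBase parameter.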

Explorer

Thank you.

I tried the same thing and it works, but when I do the same for the following parameters I get an error saying "unknown parameters":

 

"hbase.rs.cacheblocksonwrite"
"hbase.storescanner.parallel.seek.threads"
"hbase.storescanner.parallel.seek.enable"
"hbase.regionserver.hlog.blocksize"
"hbase.client.max.perregion.tasks" 
"hbase.client.max.perserver.tasks" 
"hbase.ipc.server.callqueue.handler.factor"
"hbase.ipc.server.callqueue.read.ratio"
"hbase.ipc.server.callqueue.scan.ratio"
"io.storefile.bloom.block.size"
"ycsb.client.threads" 
"hfile.index.block.max.size"
"hfile.block.bloom.cacheonwrite"
"hfile.block.index.cacheonwrite"

Could you please test it on your cluster?

I think some parameters cannot be changed this way, but I am not sure.

 

Thank you very much

Master Guru

Hi @yassine24,

 

Please share exactly what you tried to do in your code and the exact error, stack, and/or JSON response so we can help.

Explorer

Hi, 

 

I am trying to implement this paper: "ATH: Auto-Tuning HBase's Configuration via Ensemble Learning"

(https://ieeexplore.ieee.org/document/7950900/)

 

For this I have to collect a training set. Section III shows the structure of the training set.

The beginning of page 5 explains how to generate the data for the training set.

 

To build the models, we need to construct a training set S. S is a matrix, with each row being the following vector:

$$v_j = [\mathrm{perf}_j, c_{1j}, \ldots, c_{ij}, \ldots, c_{nj}], \quad j = 1, \ldots, m \tag{1}$$

with $v_j$ the $j$th observation, $\mathrm{perf}_j$ the throughput or latency, and $c_{ij}$ the $i$th HBase configuration parameter of the $j$th observation. $n$ is the total number of HBase configuration parameters, and $m$ is the total number of vectors in matrix $S$ (observations or training examples).

 

 

I need to generate random values for the parameters shown in the table on page 8.
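As a rough sketch of what that data collection could look like, the code below draws random values and assembles rows in the layout quoted above, with the performance value first and the parameter values after it. The parameter names are taken from the list in this thread, but the types and ranges are placeholders I made up, not the values from the paper's table, and fake_measure stands in for a real benchmark run.

```python
import random

# Hypothetical search space: (name, kind, low, high).
# The ranges are placeholders, NOT the ones from the paper's table.
SEARCH_SPACE = [
    ("hbase.client.max.perregion.tasks", "int", 1, 10),
    ("hbase.ipc.server.callqueue.read.ratio", "float", 0.0, 1.0),
    ("hbase.storescanner.parallel.seek.enable", "bool", None, None),
]


def sample_config(rng):
    """Draw one random value per parameter and return them as a list."""
    values = []
    for _name, kind, low, high in SEARCH_SPACE:
        if kind == "int":
            values.append(rng.randint(low, high))
        elif kind == "float":
            values.append(rng.uniform(low, high))
        else:
            values.append(rng.choice([True, False]))
    return values


def build_training_set(m, measure_perf, seed=0):
    """Build S as a list of m rows, each [perf, c_1, ..., c_n]."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        config = sample_config(rng)
        perf = measure_perf(config)  # e.g. throughput from a benchmark run
        rows.append([perf] + config)
    return rows


def fake_measure(config):
    """Placeholder for applying the config and measuring throughput or latency."""
    return 0.0


S = build_training_set(5, fake_measure)
```

In a real run, measure_perf would apply the sampled values to the cluster, run the workload, and return the measured throughput or latency.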

 

I used the Cloudera Manager API. I go through the HBase service and try to update the parameters with Python code.

But there are some parameters that are not in the HBase service, such as:

"hbase.rs.cacheblocksonwrite"
"hbase.storescanner.parallel.seek.threads"
"hbase.storescanner.parallel.seek.enable"
"hbase.regionserver.hlog.blocksize"
"hbase.client.max.perregion.tasks" 
"hbase.client.max.perserver.tasks" 
"hbase.ipc.server.callqueue.handler.factor"
"hbase.ipc.server.callqueue.read.ratio"
"hbase.ipc.server.callqueue.scan.ratio"
"io.storefile.bloom.block.size"
"ycsb.client.threads" 
"hfile.index.block.max.size"
"hfile.block.bloom.cacheonwrite"
"hfile.block.index.cacheonwrite"

Maybe not all of them, but most of them.

So I wonder whether my setup does not use these parameters or my code is wrong.

My code is below:

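# `api` is the ApiResource connection created earlier (not shown in this post)

# Find the CDH5 cluster
cdh5 = None
for c in api.get_all_clusters():
  # print c.name
  if c.version == "CDH5":
    cdh5 = c

# Find the HBase service
for s in cdh5.get_all_services():
  # print s
  if s.type == "HBASE":
    hbase = s

# Pick a role of the HBase service; roles are listed with get_all_roles().
# Filtering on REGIONSERVER is an assumption: "HBASE" is a service type, not a role type.
nn = None
for r in hbase.get_all_roles():
  print r
  if r.type == 'REGIONSERVER':
    nn = r

print "Role name: %s\nState: %s\nHealth: %s\nHost: %s" % (
    nn.name, nn.roleState, nn.healthSummary, nn.hostRef.hostId)

# Dump the full configuration of the selected role
t = nn.get_config(view='full')
for key, value in t.iteritems():
  print(key)
  print(value)

nn.update_config({
  'io_storefile_bloom_block_size': True
  # the other parameters with their values
})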

Thank you.

I hope I was clear.

 

 

Master Guru

@yassine24,

 

io.storefile.bloom.block.size requires an integer, not a boolean.

 

The background is good, but I'm not sure what problem you are seeing when you try to update the configuration.
