New Contributor
Posts: 1
Registered: 03-21-2018

Error setting cgroups memory for add-on service

We install our software component as an add-on service using a parcel and CSD, with cgroups enabled for CPU shares and a memory hard limit. When we set the static memory allocation, the service cgroup was created with the right values. This worked fine with Cloudera Manager 5.9.1.

 

We recently upgraded to Cloudera Manager 5.14.1, and the memory cgroup settings no longer work.

 

  1. When the service is started, we see this error in the cloudera-scm-agent.log.
    [20/Mar/2018 02:11:51 +0000] 3086 MainThread process      ERROR    [] Could not evaluate resource {u'hard_limit': 17179869184, u'soft_limit': -1}: 'unicode' object has no attribute 'items'
    Traceback (most recent call last):
      File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.14.1-py2.6.egg/cmf/process.py", line 924, in _iter
        fn(res)
      File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.14.1-py2.6.egg/cmf/process.py", line 873, in _do_mem_resource
        self.agent.cg_manager.configure_group(path, "memory", mem_resource)
      File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.14.1-py2.6.egg/cmf/cgroups.py", line 605, in configure_group
        for resource_name, resource_value in resource.items():

It seems the configure_group function in cgroups.py expects a JSON object with child items, but the object passed in is flat and has none, hence the error. As a workaround, I edited the _do_mem_resource function in process.py to pass the entire mem_resources object:


self.agent.cg_manager.configure_group(path, "memory", mem_resources)

 

With this change, we did not see the startup error and the cgroup setting was modified at startup, which then exposed issue #2.
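The failure in issue 1 can be reproduced outside the agent. The sketch below is not the actual cmf code: configure_group_sketch is a hypothetical stand-in that only mirrors the `resource.items()` loop shown in the traceback. It assumes the agent ended up passing a single string instead of a dict, which is one way to get this error. On Python 3 the message names 'str' rather than 'unicode'.

```python
def configure_group_sketch(resource):
    """Hypothetical stand-in for the loop at cgroups.py line 605."""
    settings = {}
    for resource_name, resource_value in resource.items():
        settings[resource_name] = resource_value
    return settings

# Values taken from the agent log line above.
mem_resources = {"hard_limit": 17179869184, "soft_limit": -1}

# Passing the whole dict works, because dicts have .items().
ok = configure_group_sketch(mem_resources)

# Passing a single string (assumed here for illustration) fails
# with the same kind of AttributeError as in the traceback.
try:
    configure_group_sketch("hard_limit")
    error = None
except AttributeError as exc:
    error = str(exc)

print(ok)
print(error)
```

This only shows why passing the full dict makes the startup error go away. It says nothing about whether the values in that dict are then interpreted correctly.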

 

2. We allocated 64 GB per node for the service on the static allocation page, but the cgroup's memory.limit_in_bytes contained a far larger value: 72057594037927936.

 

The cloudera-scm-agent.log shows the right value, 68719476736 bytes, but the value that ends up on the cgroup is 72057594037927936.

 

[21/Mar/2018 04:28:15 +0000] 24060 MainThread cgroups      INFO     Reconfiguring cgroup pseudofile /var/run/cloudera-scm-agent/cgroups/memory/<my service role>/memory.limit_in_bytes with value 68719476736
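A quick way to confirm the mismatch is to read the pseudofile back and compare it with the value from the log. This is a minimal sketch: read_cgroup_limit and limit_matches are hypothetical helper names, and the demo uses a temporary file in place of the real memory.limit_in_bytes path.

```python
import os
import tempfile

def read_cgroup_limit(pseudofile_path):
    """Return the integer stored in a cgroup pseudofile."""
    with open(pseudofile_path) as f:
        return int(f.read().strip())

def limit_matches(pseudofile_path, expected_bytes):
    """Compare the value on disk with the value the agent logged."""
    actual = read_cgroup_limit(pseudofile_path)
    return actual == expected_bytes, actual

# Demo: a temporary file stands in for the real pseudofile and
# holds the value we observed on the cgroup.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "memory.limit_in_bytes")
    with open(fake, "w") as f:
        f.write("72057594037927936\n")
    matches, actual = limit_matches(fake, 68719476736)

print(matches, actual)
```

On a real node, pass the path from the log line above as pseudofile_path.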

 

The /usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.14.1-py2.6.egg/cmf/cgroups.py script that writes the cgroup entry appears to assume the value is in MB, so it multiplies it by 1024 * 1024. That turns 68719476736 into 72057594037927936 (2^56), which is what gets written to the cgroup file.
 
I then compared the behavior with Cloudera Manager 5.9.x, where this was all working.
It looks like 5.9.x stored everything internally as MB. When we set 64 GB on the static allocation page, proc.json had the memory hard limit set to 65536, and cgroups.py multiplied it by 1024 * 1024 to compute the actual bytes.

 

In CM 5.14.1, it looks like everything is stored as bytes, as we see in proc.json, but cgroups.py still treats the values as MB and converts them to bytes again.
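The arithmetic behind the unit mismatch can be checked directly. This is a sketch of the suspected behavior, not the agent's code: to_bytes_assuming_mb is a hypothetical function that always multiplies its input by 1024 * 1024, which is what the writer seems to do.

```python
MB = 1024 * 1024

def to_bytes_assuming_mb(value):
    """Hypothetical writer that always treats its input as megabytes."""
    return value * MB

limit_59_mb = 65536             # 64 GB as seen in proc.json on CM 5.9.x (MB)
limit_514_bytes = 68719476736   # 64 GB as seen in proc.json on CM 5.14.1 (bytes)

written_59 = to_bytes_assuming_mb(limit_59_mb)
written_514 = to_bytes_assuming_mb(limit_514_bytes)

print(written_59)               # correct byte count for 64 GB
print(written_514)              # the oversized value found on the cgroup
print(written_514 == 2 ** 56)   # the bad value is exactly 2^56
```

The first result matches the byte value in the agent log, and the second matches the value found in memory.limit_in_bytes, which is consistent with the bytes value being scaled a second time.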

 

Are these two issues known? Is there a patch that addresses them?

 

Appreciate any help.

 

Cloudera Employee
Posts: 506
Registered: 07-30-2013

Re: Error setting cgroups memory for add-on service

Hi Venkat,

 

This is a known issue. It is fixed in the upcoming 5.14.2 release and in 5.15 and later. It has been reported in versions as old as CM 5.12.1, though I'm not sure that is the oldest affected version. It would explain what you're seeing.

 

Thanks,

Darren
