Explorer
Posts: 23
Registered: ‎11-29-2016

ResourceManager Memory Leak?

We have two CDH clusters running the same version (CDH-5.5.2-1.cdh5.5.2.p0.4), and the ResourceManager in each cluster has the same configuration.

One of the ResourceManagers runs fine: its heap usage stays at a roughly constant value (e.g. 800 MB) over time.

The other one throws an OOM exception and exits after about 15 days. When we use 'jmap -F -histo' to dump its JVM heap histogram, we see that the total size of the 'char[]' objects keeps growing over time until the RM finally throws OOM.

Below is the key information from the JVM dumps of both the good RM and the OOM RM:

Dump command: jmap -F -histo pid (the heap size of both RMs is 1 GB)

 

A) JVM dump of the good RM in cluster A

We see 400,000+ char[] instances using 60+ MB of heap.

jvm_heap_ok.png

 

 

B) JVM dump of the bad RM (OOM) in cluster B

We see 300,000+ char[] instances, but using 400+ MB of heap.

jvm_heap_notok.png
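
To make the two histograms easier to compare, here is a minimal Python sketch that parses saved 'jmap -histo' output and computes the average size per instance of a class. It assumes each histogram row has the form "rank: instances bytes class-name"; the exact layout and the class label (char[] or [C) can vary with the JVM version and with -F. The helper names parse_histo and avg_bytes are made up for this example, and the two sample rows are only rounded from the approximate figures quoted above, not real dump output.

```python
import re

# Assumed row layout: "<rank>: <instances> <bytes> <class name>".
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_histo(text):
    """Return {class_name: (instances, total_bytes)} from histogram text."""
    result = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and separator lines do not match and are skipped
            name = match.group(3)
            result[name] = (int(match.group(1)), int(match.group(2)))
    return result


def avg_bytes(histo, class_name):
    """Average bytes per instance for one class."""
    instances, total_bytes = histo[class_name]
    return total_bytes / instances


# Illustrative rows only, rounded from the figures in this post.
good = parse_histo("1: 400000 60000000 char[]")
bad = parse_histo("1: 300000 400000000 char[]")

print("good RM avg char[] bytes:", avg_bytes(good, "char[]"))
print("bad RM avg char[] bytes:", avg_bytes(bad, "char[]"))
```

With these rounded figures the bad RM has fewer char[] instances, but each one is on average several times larger, which suggests a smaller number of large strings rather than many small ones.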

 

Any help will be appreciated.

Explorer
Posts: 23
Registered: ‎11-29-2016

Re: ResourceManager Memory Leak?

Today we took a full heap dump (jmap -F -dump:file=file.dump_result pid) and analysed the dump file with MAT (Memory Analyzer Tool). We found that the instance variable applications (a java.util.concurrent.ConcurrentHashMap) in org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext takes up a lot of memory:

 

currentHashMap_1.png

currentHashMap_2.png
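
One way to check from the outside whether the two RMs are tracking a different number of applications is to count the applications each RM reports through its REST API. The sketch below is only an illustration: it assumes the Cluster Applications endpoint (/ws/v1/cluster/apps) is reachable, that the web port is 8088, and that the reported list roughly corresponds to what the RM keeps in the applications map, which has not been verified here. The function names and the sample payload are made up for the example.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_apps(rm_host, port=8088):
    """Fetch the application list from an RM (host and port are assumptions)."""
    url = "http://%s:%d/ws/v1/cluster/apps" % (rm_host, port)
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def count_apps_by_state(payload):
    """Count applications per state; tolerates an empty or null 'apps' field."""
    apps = (payload.get("apps") or {}).get("app") or []
    return Counter(app.get("state", "UNKNOWN") for app in apps)


# Made-up sample payload so the sketch runs without a cluster.
sample = json.loads("""
{"apps": {"app": [
  {"id": "application_0000000000000_0001", "state": "FINISHED"},
  {"id": "application_0000000000000_0002", "state": "FINISHED"},
  {"id": "application_0000000000000_0003", "state": "RUNNING"}
]}}
""")

counts = count_apps_by_state(sample)
print(sum(counts.values()), dict(counts))
```

Running count_apps_by_state(fetch_apps(host)) against both clusters and comparing the totals would show whether the bad RM is simply holding many more applications, or whether each entry is unusually large.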

New Contributor
Posts: 1
Registered: ‎07-27-2017

Re: ResourceManager Memory Leak?

Have you solved this problem? We just encountered the same problem...
