Uneven allocation of containers causing high load avg

Hello HCC,

We recently upgraded our prod and all dev clusters from HDP 2.5.3.0 to HDP 2.6.1.0, and we are observing odd behavior in HDP 2.6.1.0: some nodes receive a very high number of containers, which drives the load average on those servers very high and causes the nodes to go into the heartbeat-lost state.

When a node's load average gets very high, the NN marks it as a dead node, whereas the RM keeps assigning containers to it (we know that the RM and NN work independently), and all those containers cause jobs to fail. Every time this happens, we ask our SA team to reboot the affected servers to alleviate the issue. We did not see this behavior with HDP 2.5.3.0.

Please find the attached screenshots for reference; they show nodes with a very high number of containers and a high load average.

Present versions : HDP 2.6.1.0 and Ambari 2.5.2.0
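
To put numbers on the skew instead of relying on screenshots, something like the following might help. It is only a sketch: it assumes the node list has already been fetched from the ResourceManager REST endpoint `/ws/v1/cluster/nodes` and parsed as JSON, and it uses the `state`, `nodeHostName` and `numContainers` fields from that response. The hostnames, the sample payload and the `factor` threshold are made up for illustration.

```python
def overloaded_nodes(payload, factor=2.0):
    """Return (host, containers) for RUNNING nodes holding more than
    `factor` times the mean container count, busiest first.

    `payload` is the parsed JSON body of /ws/v1/cluster/nodes.
    `factor` is an arbitrary illustrative threshold, not a YARN setting.
    """
    nodes = (payload.get("nodes") or {}).get("node") or []
    running = [n for n in nodes if n.get("state") == "RUNNING"]
    if not running:
        return []
    mean = sum(n["numContainers"] for n in running) / len(running)
    flagged = [
        (n["nodeHostName"], n["numContainers"])
        for n in running
        if n["numContainers"] > factor * mean
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical payload, shaped like the RM response but with invented values.
sample = {
    "nodes": {
        "node": [
            {"nodeHostName": "worker1", "state": "RUNNING", "numContainers": 4},
            {"nodeHostName": "worker2", "state": "RUNNING", "numContainers": 5},
            {"nodeHostName": "worker3", "state": "RUNNING", "numContainers": 3},
            {"nodeHostName": "worker4", "state": "RUNNING", "numContainers": 40},
            {"nodeHostName": "worker5", "state": "LOST", "numContainers": 0},
        ]
    }
}

for host, count in overloaded_nodes(sample):
    print(host, count)
```

Running this periodically against the live RM would show whether the same few nodes keep ending up far above the cluster average.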

@Kuldeep Kulkarni @Jay SenSharma @Artem Ervits @ssathish

Attachments: ss-1.png, ss-2.png, ss-3.png, ss-4.png
Re: Uneven allocation of containers causing high load avg

Log messages from one of the servers where we are observing this behavior:

2017-10-27 21:20:43,991 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 10059 for container-id container_e746_1508665985104_313505_01_002805: -1B of 4 GB physical memory used; -1B of 8.4 GB virtual memory used
2017-10-27 21:20:44,049 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 108356 for container-id container_e746_1508665985104_313505_01_002168: 1.2 MB of 4 GB physical memory used; 103.6 MB of 8.4 GB virtual memory used
2017-10-27 21:20:44,105 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 13033 for container-id container_e746_1508665985104_304789_01_002499: -1B of 4 GB physical memory used; -1B of 8.4 GB virtual memory used
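
For anyone looking at a longer NodeManager log, here is a rough sketch for counting how many containers report `-1B` versus a real figure. It assumes the line format shown above stays the same, and it treats `-1B` as "the NodeManager could not read memory usage for that process tree", which is my reading of the log rather than something confirmed here. The function name and the regular expression are my own.

```python
import re
from collections import Counter

# Matches the ContainersMonitorImpl "Memory usage of ProcessTree" lines above.
LINE_RE = re.compile(
    r"Memory usage of ProcessTree (?P<pid>\d+) for container-id "
    r"(?P<cid>container_\S+): (?P<pmem>-?[\d.]+ ?[KMGT]?B) of "
)


def summarize_monitor_lines(lines):
    """Count containers with a readable vs. unreadable (-1B) physical
    memory figure, and collect the container ids that reported -1B."""
    counts = Counter()
    unreadable = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a memory-usage line
        if match.group("pmem") == "-1B":
            counts["unreadable"] += 1
            unreadable.append(match.group("cid"))
        else:
            counts["reported"] += 1
    return counts, unreadable


# The three lines quoted above, shortened to the part the regex needs.
log_lines = [
    "Memory usage of ProcessTree 10059 for container-id "
    "container_e746_1508665985104_313505_01_002805: "
    "-1B of 4 GB physical memory used; -1B of 8.4 GB virtual memory used",
    "Memory usage of ProcessTree 108356 for container-id "
    "container_e746_1508665985104_313505_01_002168: "
    "1.2 MB of 4 GB physical memory used; 103.6 MB of 8.4 GB virtual memory used",
    "Memory usage of ProcessTree 13033 for container-id "
    "container_e746_1508665985104_304789_01_002499: "
    "-1B of 4 GB physical memory used; -1B of 8.4 GB virtual memory used",
]

counts, unreadable = summarize_monitor_lines(log_lines)
print(dict(counts))
print(unreadable)
```

On a real log, pass the open file object instead of `log_lines`; a large share of `-1B` entries on the overloaded nodes would be worth including in the report.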