Ambari-2.1.2, HDP-18.104.22.168-2950. I noticed that the agents use too much resident memory on a cluster that has been running for a few days.
I found a suggested solution: https://community.hortonworks.com/questions/21253/ambari-agent-memory-leak-or-taking-too-much-memory.html
https://issues.apache.org/jira/browse/AMBARI-17539
I modified main.py as described there, but the agent still leaks memory. The following is the code I added:

def fix_subprocess_racecondition():
    """
    Subprocess in Python has race condition with enabling/disabling gc. Which may lead to turning off python garbage collector.
    This leads to a memory leak.
    This function monkey patches subprocess to fix the issue.

    !!! PLEASE NOTE THIS SHOULD BE CALLED BEFORE ANY OTHER INITIALIZATION was done to avoid already created links to subprocess or subprocess.gc or gc
    """
    # monkey patching subprocess
    import subprocess
    subprocess.gc.isenabled = lambda: True
    # re-importing gc to have correct isenabled for non-subprocess contexts
    import sys
    del sys.modules['gc']
    import gc

def fix_subprocess_popen():
    """
    Workaround for race condition in starting subprocesses concurrently from
    multiple threads via the subprocess and multiprocessing modules.
    See http://bugs.python.org/issue19809 for details and repro script.
    """
    import os
    import sys
    # multiprocessing.forking only exists on Python 2, so compare the major version
    if os.name == 'posix' and sys.version_info[0] < 3:
        from multiprocessing import forking
        import subprocess
        import threading
        sp_original_init = subprocess.Popen.__init__
        mp_original_init = forking.Popen.__init__
        lock = threading.RLock()  # guards subprocess creation
        def sp_locked_init(self, *a, **kw):
            with lock:
                sp_original_init(self, *a, **kw)
        def mp_locked_init(self, *a, **kw):
            with lock:
                mp_original_init(self, *a, **kw)
        subprocess.Popen.__init__ = sp_locked_init
        forking.Popen.__init__ = mp_locked_init
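
To check whether the garbage collector is really the cause, it may help to log the collector state next to the process memory while the agent runs. The snippet below is only a rough sketch and not part of Ambari: memory_report is a hypothetical helper name, and it assumes a POSIX host where the standard resource module is available.

```python
import gc
import resource
import sys


def memory_report():
    """Return a snapshot of garbage collector state and peak resident memory.

    Hypothetical diagnostic helper, not part of the Ambari agent.
    """
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        peak //= 1024
    return {
        "gc_enabled": gc.isenabled(),              # False would match the gc race described above
        "gc_counts": gc.get_count(),               # pending allocations per generation
        "tracked_objects": len(gc.get_objects()),  # steady growth here hints at a leak
        "peak_rss_kb": peak,
    }


if __name__ == "__main__":
    print(memory_report())
```

Calling memory_report() periodically from the agent process and comparing the values over a few hours should show whether tracked_objects keeps growing. Note that after the monkey patch, any module still holding the original gc module object will see the patched isenabled that always returns True, so gc_counts and tracked_objects are the more reliable signals. If gc is enabled and the object count stays flat while resident memory still climbs, the leak is probably somewhere other than the subprocess gc race.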