HAWQ YARN containers run forever and are not killed because of a long sleep


I am working with a HAWQ installation integrated with the YARN resource manager. I have noticed that the YARN container processes are not killed after a query finishes:

# ps -ef|grep -i postgres

postgres 477822 238279 0 05:36 ? 00:00:00 /usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor postgres postgres 1 application_1496358601894_1103 container_1496358601894_1103_01_000139 /hd_data/disk1/hadoop/yarn/local/usercache/postgres/appcache/application_1496358601894_1103/container_1496358601894_1103_01_000139 /hd_data/disk1/hadoop/yarn/local/nmPrivate/application_1496358601894_1103/container_1496358601894_1103_01_000139/ /hd_data/disk5/hadoop/yarn/local/nmPrivate/application_1496358601894_1103

postgres 477826 477822 0 05:36 ? 00:00:00 sleep 10000000000

I am not able to identify the property that sets the sleep time to 10000000000.

Is anyone familiar with this kind of issue?



You're probably looking at HAWQ processes that cache YARN containers for a few minutes after a query completes, so that subsequent queries do not need a new RPC call to the YARN ResourceManager (RM). This is configurable; see the HAWQ documentation.

It's not clear to me at the moment what the second process is. I'm not sure whether it is the postmaster process on port 5432, but it doesn't look like it. You can temporarily switch from YARN mode to standalone mode via Ambari and compare the processes running in each mode.
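
To make that comparison easier, here is a rough sketch of two shell helpers that read `ps -ef` output and pair each container-executor process with its child processes. The function names `list_yarn_containers` and `children_of` are made up for illustration, and the field positions assume the standard `ps -ef` column layout shown in the question (PID in column 2, parent PID in column 3).

```shell
# Read `ps -ef`-style lines on stdin and print "PID CONTAINER_ID"
# for every container-executor process.
# Assumes column 2 is the PID and the container ID is a separate argument.
list_yarn_containers() {
    awk '/container-executor/ {
        for (i = 1; i <= NF; i++) {
            # Match the bare container ID, not the paths that contain it.
            if ($i ~ /^container_[0-9_]+$/) { print $2, $i; break }
        }
    }'
}

# Read `ps -ef`-style lines on stdin and print the PIDs of all
# processes whose parent PID (column 3) equals the first argument.
children_of() {
    awk -v parent="$1" '$3 == parent { print $2 }'
}
```

For example, `ps -ef | list_yarn_containers` lists the live containers, and `ps -ef | children_of 477822` shows what the container-executor with PID 477822 started. Running both in YARN mode and again in standalone mode should show which processes exist only because of the YARN integration.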