Created 05-24-2016 04:47 PM
I want to avoid the "GC Overhead limit exceeded" error when starting up the NodeManager with yarn.nodemanager.recovery.enabled = true. How do you determine the right heap size for the NodeManager?
Created 05-24-2016 08:11 PM
The best way to find the nodemanager heap size and other memory settings is to calculate it specifically for your cluster size and hardware spec.
Here is the utility that you can use
see this link
Created 05-24-2016 08:18 PM
@Geoffrey Shelton Okot I am specifically looking for heap sizing for the YARN ResourceManager and NodeManager daemons, not for container memory allocation sizing.
Created 05-24-2016 08:44 PM
I don't know of any calculation that gives the exact heap size for the NM, but the NM typically doesn't require a lot of memory. I have seen customers for whom a 2-4 GB heap is more than sufficient on nodes with 256 GB+ of physical memory.
What was the heap size when you saw "GC Overhead limit exceeded", and how many jobs were running that had to be recovered?
Created 05-24-2016 08:51 PM
The heap size was 1 GB, which looks like the default. Is there a class that can read the "yarn-nm-recovery" file to check how many jobs were running and had to be recovered?
Created 05-24-2016 09:32 PM
The NM stores app/container info inside /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery, but I'm not aware of any tool that can read these files. Instead, you can parse the RM and NM logs to get a rough idea of the container count. I would also recommend increasing the NM heap size from 1 GB to 3 GB and restarting the NM service.
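If it helps, here is a rough sketch of the log-parsing idea in Python. It only relies on the standard YARN container ID format (container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>, with an optional epoch segment such as e17 after an RM restart). The function name count_containers and the sample log lines are made up for illustration, and the result is only an estimate of how many containers the NM has seen, not a reading of the recovery state itself.

```python
import re
from collections import Counter

# Standard YARN container ID, optionally with an epoch segment (container_e17_...).
# Group 1 is the cluster timestamp, group 2 the application sequence number.
CONTAINER_ID = re.compile(r"container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+")


def count_containers(lines):
    """Return (set of distinct container IDs, Counter of containers per application).

    Hypothetical helper: it scans arbitrary log lines for container IDs,
    so it makes no assumption about the rest of the log layout.
    """
    seen = set()
    for line in lines:
        for match in CONTAINER_ID.finditer(line):
            seen.add(match.group(0))

    per_app = Counter()
    for container_id in seen:
        match = CONTAINER_ID.fullmatch(container_id)
        app_id = f"application_{match.group(1)}_{match.group(2)}"
        per_app[app_id] += 1
    return seen, per_app


if __name__ == "__main__":
    # Made-up sample lines; in practice pass an open NM log file instead.
    sample = [
        "INFO Start request for container_1464106000000_0001_01_000002",
        "INFO container_1464106000000_0001_01_000002 changed state",
        "INFO Start request for container_e17_1464106000000_0002_01_000001",
    ]
    containers, apps = count_containers(sample)
    print(len(containers), "distinct containers across", len(apps), "applications")
```

To run it against a real log, pass an open file handle (for example `count_containers(open(path))`, where path points at your NM log) instead of the sample list.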