Created on 11-04-2015 04:59 AM - edited 09-16-2022 02:47 AM
After studying the basics of Java GC, it seems like the Serial (default) GC would be best for YARN containers (low core:task ratio), and CMS or G1 would be best for long-running services that occupy more memory (master services and some edge servers). Are these assumptions valid?
What is recommended for worker services? Is there any situation in the HDP ecosystem where it's recommended to start with ParallelGC or ParallelOldGC?
I still hear of people using CMS, but it looks like G1 is meant to replace it as of Java 7+. Is there any reason to choose CMS over G1 when the latter is available?
Are there additional garbage collectors worth learning about, beyond: Serial, Parallel, ParallelOld, CMS, and G1?
Created 02-03-2016 07:59 PM
I'm most familiar with GC tuning for HDFS, so I'll answer from that perspective.
As you expected, our recommendation for the HDFS daemons is CMS. In practice, we have found that some of the default settings for CMS are sub-optimal for the NameNode's heap usage pattern. In addition to enabling CMS, we recommend tuning a few of those settings.
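After changing a daemon's JVM options, it can help to confirm which collector the JVM actually picked up. Below is a minimal sketch, not part of the original answer, that reads the collector names from the standard `java.lang.management` MXBeans. The class name `GcReport` and the `classify` helper are hypothetical, and the name-to-collector mapping is a heuristic based on the bean names HotSpot commonly reports, so treat it as an assumption for other JVMs. The specific CMS tuning values are in the linked article and are not reproduced here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {

    // Names of the collectors the running JVM has registered as MXBeans.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    // Heuristic (assumption): map common HotSpot bean names to a collector family.
    static String classify(List<String> names) {
        for (String n : names) {
            if (n.equals("ConcurrentMarkSweep")) return "CMS";
            if (n.startsWith("G1 ")) return "G1";
            if (n.startsWith("PS ")) return "Parallel";
        }
        if (names.contains("Copy") && names.contains("MarkSweepCompact")) {
            return "Serial";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names + " -> " + classify(names));
    }
}
```

On a Java 7 or 8 HotSpot JVM, running this with `-XX:+UseConcMarkSweepGC` should classify the process as CMS; running it with no GC flags shows whichever default the JVM selected for that machine.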
I agree that G1 would be good to evaluate as the future direction. As of right now, we have not tested and certified with G1, so I can't recommend using it.
For more details, please refer to the NameNode garbage collection deep dive article that I just posted.
Created 01-13-2016 03:45 PM
@Alex Miller did you ever get an answer to this question outside the forum?
Created 01-13-2016 04:42 PM
I discovered an internal doc by @Chris Nauroth that provides best practices and troubleshooting tips. Perhaps he would like to share it as a KB when time permits.
Created 02-03-2016 07:56 PM
@Alex Miller, that's a great idea. I've just imported that doc as a new article here: https://community.hortonworks.com/articles/14170/namenode-garbage-collection-configuration-best-pra.... .