Guidelines for initial garbage collection settings in HDP?
Created on ‎11-04-2015 04:59 AM - edited ‎09-16-2022 02:47 AM
After studying the basics on Java GC, it seems like the Serial (default) GC would be best for YARN containers (low core:task ratio), and CMS or G1 would be best for long-running services that occupy more memory (master services and some edge servers). Are these assumptions valid?
What is recommended for worker services? Is there any situation in the HDP ecosystem where it's recommended to start with ParallelGC or ParallelOldGC?
I still hear of people using CMS, but it looks like it has been superseded by G1 as of Java 7. Is there any reason to choose CMS over G1 when the latter is available?
Are there additional garbage collectors worth learning about, beyond: Serial, Parallel, ParallelOld, CMS, and G1?
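For reference, here is a small sketch of how the five collectors named above map to HotSpot command-line flags. The flag names are the standard Java 7/8 HotSpot options; the `COLLECTOR_FLAGS` table and the `gc_flag` helper are hypothetical names used only for illustration. Note that CMS was removed in later JDK releases, so this applies to the JVMs of the HDP 2.x era.

```python
# Collector names from the question mapped to the HotSpot flag that selects each.
# COLLECTOR_FLAGS and gc_flag are illustrative names, not part of HDP or Hadoop.
COLLECTOR_FLAGS = {
    "Serial": "-XX:+UseSerialGC",
    "Parallel": "-XX:+UseParallelGC",
    "ParallelOld": "-XX:+UseParallelOldGC",
    "CMS": "-XX:+UseConcMarkSweepGC",
    "G1": "-XX:+UseG1GC",
}


def gc_flag(collector: str) -> str:
    """Return the JVM flag that selects the named collector."""
    try:
        return COLLECTOR_FLAGS[collector]
    except KeyError:
        raise ValueError(f"unknown collector: {collector!r}") from None


if __name__ == "__main__":
    for name in COLLECTOR_FLAGS:
        print(f"{name:12s} {gc_flag(name)}")
```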
Created ‎02-03-2016 07:59 PM
I'm most familiar with GC tuning for HDFS, so I'll answer from that perspective.
As you expected, our recommendation for the HDFS daemons is CMS. In practice, we have found that some of the default settings for CMS are sub-optimal for the NameNode's heap usage pattern. In addition to enabling CMS, we recommend tuning a few of those settings.
I agree that G1 would be good to evaluate as the future direction. As of right now, we have not tested and certified with G1, so I can't recommend using it.
For more details, please refer to the NameNode garbage collection deep dive article that I just posted.
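To make "enabling CMS and tuning a few settings" concrete, here is a minimal sketch that assembles a JVM option string for the NameNode. This is an assumption-laden illustration, not the certified configuration: the answer above does not name which settings to tune, so the occupancy flags and the 70% threshold below are my own placeholders, and the `cms_opts` helper is hypothetical. The flags themselves are real HotSpot options on Java 7/8. Please take the actual values from the linked article.

```python
def cms_opts(heap_gb: int, occupancy_pct: int = 70) -> str:
    """Build an illustrative CMS option string for a long-running daemon.

    heap_gb       -- fixed heap size in GB (initial and maximum set equal)
    occupancy_pct -- old-generation occupancy at which a CMS cycle starts;
                     70 is a placeholder, not a value from the answer above
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be positive")
    if not 0 < occupancy_pct < 100:
        raise ValueError("occupancy_pct must be between 1 and 99")

    flags = [
        f"-Xms{heap_gb}g",
        f"-Xmx{heap_gb}g",
        "-XX:+UseConcMarkSweepGC",
        # Start concurrent collection at a fixed occupancy instead of
        # letting the JVM adapt the threshold at run time.
        f"-XX:CMSInitiatingOccupancyFraction={occupancy_pct}",
        "-XX:+UseCMSInitiatingOccupancyOnly",
    ]
    return " ".join(flags)


if __name__ == "__main__":
    print(cms_opts(8))
```

The resulting string would typically be prepended to `HADOOP_NAMENODE_OPTS` in `hadoop-env.sh`, for example `export HADOOP_NAMENODE_OPTS="<options> ${HADOOP_NAMENODE_OPTS}"`.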
Created ‎01-13-2016 03:45 PM
@Alex Miller did you ever get an answer to this question outside the forum?
Created ‎01-13-2016 04:42 PM
I discovered an internal doc by @Chris Nauroth that provides best practices and troubleshooting tips. Perhaps he would like to share it as a KB when time permits.
Created ‎02-03-2016 07:56 PM
@Alex Miller, that's a great idea. I've just imported that doc as a new article here: https://community.hortonworks.com/articles/14170/namenode-garbage-collection-configuration-best-pra.... .
