How to use G1GC for my MR job

How can I specify G1GC for my MapReduce job?

I want to override the cluster-wide settings and specify these properties at the job level.

Re: How to use G1GC for my MR job

If you use CM and wish to apply this to all jobs, add the JVM options that enable G1 (for example, -XX:+UseG1GC) and any other desired flags under:

CM -> YARN (with MR2) -> Configuration -> the fields "Map Task Java Opts Base" and "Reduce Task Java Opts Base".

Save and redeploy your cluster-wide client configuration: https://www.youtube.com/watch?v=4S9H3wftM_0

Outside of CM, you can pass -Dkey=value flags on the command line to your job driver, provided the driver uses the ToolRunner framework properly, as prescribed here: http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/org/apache/hadoop/util/Tool.html. Alternatively, you can set the values in code (not recommended, since this is configuration that is likely to change in the future). The config keys to modify with this approach are "mapreduce.map.java.opts" and "mapreduce.reduce.java.opts".

Ensure you also specify the maximum heap size (-Xmx) when changing the raw config keys. When done from CM, the heap sizes are inferred from the other heap-specific fields on the configuration page and added automatically, but with the second technique above you must add them yourself.
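
To make the job-level approach concrete, here is a minimal sketch in plain Java that assembles the two -D flags, each carrying an explicit -Xmx alongside -XX:+UseG1GC. The class name G1JobArgs, its helper methods, and the heap sizes are hypothetical and only for illustration; pick heap values that fit your own container sizes.

```java
import java.util.ArrayList;
import java.util.List;

public class G1JobArgs {
    // JVM options for one task type: an explicit max heap plus the G1 collector flag.
    // The heap must be given here because overriding the raw key replaces the whole string.
    static String javaOpts(int heapMb) {
        return "-Xmx" + heapMb + "m -XX:+UseG1GC";
    }

    // Generic -Dkey=value options, one for map tasks and one for reduce tasks.
    // A ToolRunner-based driver parses these before the job's own arguments.
    static List<String> genericOptions(int mapHeapMb, int reduceHeapMb) {
        List<String> args = new ArrayList<>();
        args.add("-Dmapreduce.map.java.opts=" + javaOpts(mapHeapMb));
        args.add("-Dmapreduce.reduce.java.opts=" + javaOpts(reduceHeapMb));
        return args;
    }

    public static void main(String[] argv) {
        // Hypothetical heap sizes in MB (assumption, not taken from the thread).
        for (String option : genericOptions(1638, 3276)) {
            // Quote each option: the value contains a space, so the shell must keep it as one argument.
            System.out.println("\"" + option + "\"");
        }
    }
}
```

The printed options would go right after the driver class on the command line, for example: hadoop jar <your-job.jar> <YourDriver> "-Dmapreduce.map.java.opts=-Xmx1638m -XX:+UseG1GC" "-Dmapreduce.reduce.java.opts=-Xmx3276m -XX:+UseG1GC" <job args>, where the jar and driver names are placeholders. These flags only take effect if the driver is run through ToolRunner as described above.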