I am using CM 4.8 with CDH 4.5.0, running MapReduce2 on YARN.
I'd like to know how to set the maximum number of MapReduce job counters on YARN.
I searched the YARN configuration in CM and tried setting mapreduce.job.counters.max inside my code, but it doesn't seem to work.
Does anyone have any ideas?
After making any changes to a Gateway, you should re-deploy the client configuration for your changes to take effect. Gateway configs generally update files under /etc, such as /etc/hadoop/conf, which is where clients pick up configuration by default. There should also be an icon on the main page indicating that your client configs need to be re-deployed (new in CM5).
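One way to confirm that a re-deploy actually reached the client is to read the property back from the deployed file. The following is only a minimal sketch: the helper name read_property is hypothetical, and the file name mapred-site.xml under /etc/hadoop/conf is an assumption about where the setting lands on your hosts.

```python
import xml.etree.ElementTree as ET


def read_property(path, name):
    """Return the value of `name` from a Hadoop *-site.xml file.

    Returns the first matching entry, or None if the property is absent.
    `read_property` is a hypothetical helper, not part of Hadoop or CM.
    """
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        # Each entry looks like <property><name>..</name><value>..</value></property>
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None
```

For example, read_property("/etc/hadoop/conf/mapred-site.xml", "mapreduce.job.counters.max") should return the new value as a string once the client configuration has been re-deployed, and None if the property never made it into that file.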
It's strange that the service safety valve helps, since this is a MapReduce setting and the ResourceManager and NodeManager don't generally care about MapReduce configuration. Only the MapReduce2 Application Master (which is not a daemon) really cares, and it gets its configs from clients.
The Job History Server (JHS) also uses mapreduce.job.counters.max when reading completed jobs, which is why the safety valve appears to work.
It is also worth noting what happens if the YARN service once had max counters set high and the setting was later removed: when the JHS tries to read an older job that has more counters than its current limit, opening that job through the JHS Web UI results in a "too many counters" error.
The JHS currently takes its default for max counters from mapred-default.xml, which is 120, so if any of your jobs use more counters than this, keep the safety valve set for the time being.
The max counters value in the JHS must be less than or equal to MAX_INT.
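For reference, a safety valve entry is an ordinary Hadoop property element. The sketch below renders one for mapreduce.job.counters.max and rejects values above the limit. The function name counters_max_property is hypothetical, and treating MAX_INT as the 32-bit signed maximum (2147483647) is an assumption about what is meant above.

```python
# Assumption: MAX_INT refers to the largest 32-bit signed integer.
MAX_INT = 2**31 - 1


def counters_max_property(value):
    """Render a <property> element for mapreduce.job.counters.max.

    `counters_max_property` is a hypothetical helper for illustration only.
    Raises ValueError if `value` is not an int in the range 1..MAX_INT.
    """
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("value must be an integer")
    if not 1 <= value <= MAX_INT:
        raise ValueError("value must be between 1 and %d" % MAX_INT)
    return (
        "<property>\n"
        "  <name>mapreduce.job.counters.max</name>\n"
        "  <value>%d</value>\n"
        "</property>" % value
    )
```

Printing counters_max_property(500) gives a snippet that can be pasted into the safety valve field; pick a value that covers the largest number of counters any of your jobs use.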