I am currently going through the HDP FastTrack labs, specifically "Lab 11: Managing YARN containers and queues".
Basically, you run 2 concurrent example YARN jobs to see how the minimum container size affects the ability to run multiple applications at the same time, depending on container availability.
Job: yarn jar hadoop-mapreduce-examples.jar pi 5 10
The lab asks you to set the minimum container size to 60% of the maximum container size.
Before: Min. Container Size: 2560MB, Max Container Size: 10GB
After: Min. Container Size: 5888MB, Max Container Size: 10GB
After changing the size, Ambari recommends updating a series of config settings for the MapReduce2 service, which you have to accept. Then you restart all affected services and resubmit the 2 concurrent applications. The expected result is that one goes into the running state and completes; only then does the 2nd get resources and run (unlike before, when both were able to run at the same time).
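To make the effect on concurrency concrete, here is a rough sketch of how many minimum-size containers fit into a fixed pool of YARN memory. The 20480 MB total is a hypothetical figure chosen only for illustration (the lab's actual cluster memory is not stated here), and the helper name is made up.

```python
def containers_that_fit(total_mb: int, container_mb: int) -> int:
    """Number of whole containers of container_mb that fit into total_mb."""
    return total_mb // container_mb


# Hypothetical total YARN memory; the real cluster value is not given in the lab text.
TOTAL_YARN_MB = 20480

slots_before = containers_that_fit(TOTAL_YARN_MB, 2560)  # minimum container size before
slots_after = containers_that_fit(TOTAL_YARN_MB, 5888)   # minimum container size after

print(slots_before, slots_after)
```

Under that assumption the number of available slots drops from 8 to 3, and each application needs one slot for its ApplicationMaster plus slots for its tasks, so the second application has to wait.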
ALL GOOD FOR NOW....
After you complete that exercise, you are asked to revert the changes to the YARN service through the config history and restart all affected services.
For my own sanity, I resubmitted the 2 concurrent apps, and the behavior did not go back to what it was before I made the changes. My expectation was that all config changes made as part of the initial change would be reverted. However, the changes to the MR2 configuration don't seem to be linked to the YARN change.
After some thinking, I went ahead and reverted the config for the MR2 service, restarted all affected services, and resubmitted the concurrent YARN apps... and bingo, the original behavior returned.
Am I incorrect to expect configuration changes to revert holistically across services? This could be dangerous if you don't track which other service configs were changed along with the initial change. Is there a better way?
The changes saved in Ambari are versioned on a per-service basis. This allows you to easily revert the configuration of any given service to a previous version. Unfortunately, there is currently no mechanism to atomically revert related changes across services.
I agree that automated configuration changes suggested by Ambari should be captured somehow to enable easily undoing those changes.
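Until something like that exists, one manual workaround is to snapshot the relevant configs before and after a change and diff them, so you know which services need reverting. Below is a minimal sketch that works on plain dictionaries; it does not call Ambari, the function name is hypothetical, and the MR2 values shown are illustrative assumptions rather than values taken from the lab.

```python
from typing import Dict, List, Optional, Tuple

# config type (e.g. "yarn-site") -> {property name: value}
Snapshot = Dict[str, Dict[str, str]]
Change = Tuple[str, str, Optional[str], Optional[str]]


def diff_snapshots(before: Snapshot, after: Snapshot) -> List[Change]:
    """Return (config_type, property, old_value, new_value) for every difference."""
    changes: List[Change] = []
    for config_type in sorted(set(before) | set(after)):
        old_props = before.get(config_type, {})
        new_props = after.get(config_type, {})
        for key in sorted(set(old_props) | set(new_props)):
            old, new = old_props.get(key), new_props.get(key)
            if old != new:
                changes.append((config_type, key, old, new))
    return changes


# Illustrative snapshots; the mapred-site values are assumed, not read from a cluster.
before = {
    "yarn-site": {
        "yarn.scheduler.minimum-allocation-mb": "2560",
        "yarn.scheduler.maximum-allocation-mb": "10240",
    },
    "mapred-site": {"mapreduce.map.memory.mb": "2560"},
}
after = {
    "yarn-site": {
        "yarn.scheduler.minimum-allocation-mb": "5888",
        "yarn.scheduler.maximum-allocation-mb": "10240",
    },
    "mapred-site": {"mapreduce.map.memory.mb": "5888"},
}

changes = diff_snapshots(before, after)
for config_type, key, old, new in changes:
    print(f"{config_type}: {key} {old} -> {new}")
```

Any config type other than the one you edited that shows up in the diff is a service whose config version also needs to be reverted by hand.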
Hi @Patrick Picard, yes, it's definitely a good idea to track changes caused by Ambari's recommendations. In this case there is an explanation: when you increased the minimum YARN container size to 5888m, your Mapper and Reducer sizes in MR2 were most likely still at 2560m. So when your MR application's AM asked for 2560m for a mapper, it would have been given 5888m, wasting memory because the Mapper would use only 2560m. That is presumably why Ambari recommended raising the MR2 values. [By the way, you didn't have to accept those recommendations; you could uncheck some or all of the values and click OK.] Once you had accepted them and then reverted the minimum YARN container size, Ambari didn't detect anything conflicting and so didn't produce any new recommendations. Those recommendations are generated by the Ambari StackAdvisor, which is usually updated with every new version of Ambari.
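The memory waste described above can be sketched as follows. This assumes the scheduler rounds each request up to a multiple of the minimum allocation, which is my understanding of the Capacity Scheduler's behavior; the function name is hypothetical, and capping at the maximum is a simplification.

```python
import math


def normalize_request(requested_mb: int, min_alloc_mb: int, max_alloc_mb: int) -> int:
    """Round a request up to a multiple of min_alloc_mb, capped at max_alloc_mb.

    Simplified model of scheduler normalization, not the actual YARN code.
    """
    if requested_mb <= 0 or min_alloc_mb <= 0:
        raise ValueError("sizes must be positive")
    multiples = math.ceil(requested_mb / min_alloc_mb)
    return min(multiples * min_alloc_mb, max_alloc_mb)


mapper_request_mb = 2560

# Minimum allocation raised to 5888 MB while the mapper still asks for 2560 MB.
granted_mb = normalize_request(mapper_request_mb, 5888, 10240)
wasted_mb = granted_mb - mapper_request_mb

# Original minimum allocation of 2560 MB: the request fits exactly.
granted_original_mb = normalize_request(mapper_request_mb, 2560, 10240)

print(granted_mb, wasted_mb, granted_original_mb)
```

With the raised minimum, the mapper is granted 5888 MB and 3328 MB of that sits unused, while with the original minimum it gets exactly the 2560 MB it asked for.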
Thanks guys. I guess a feature request could be done to have the config reversal give you a hint of the other config that could/should be reversed.
This was a fun troubleshooting exercise 🙂