03-24-2017 01:17 AM
Can we prevent the ResourceManager from retrying a failed ApplicationMaster on the same NodeManager? My Hive job kept trying to launch its ApplicationMaster on the same NM after multiple failures, which caused the whole job to fail after 4 retries.
03-29-2017 10:42 PM
This is the response I got from Cloudera support:
"I can see you're running on CDH 5.3.3, and this was added as a feature in YARN-2005, which was included in CDH releases starting from CDH 5.5.0: "YARN-2005: Blacklisting support for scheduling AMs" https://issues.apache.org/jira/browse/YARN-2005.
Unfortunately this can't be backported to your version of CDH and you will have to resort to an upgrade."
Another ticket related to blacklisting AMs is
This fix is being released with Hadoop 2.8.0.
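For anyone wondering what AM blacklisting (YARN-2005) means in practice, here is a rough Python sketch of the idea only. It is not YARN's code: the function name `pick_am_node`, the node names, and the `disable_threshold` parameter (with its 0.8 default) are all assumptions made up for illustration. The point is just that the next AM attempt skips nodes where earlier attempts failed, unless so much of the cluster is blacklisted that the blacklist has to be ignored.

```python
def pick_am_node(nodes, failed_attempt_nodes, disable_threshold=0.8):
    """Toy model: choose a node for the next AM attempt.

    nodes                -- list of available NodeManager names (hypothetical)
    failed_attempt_nodes -- nodes where earlier AM attempts of this app failed
    disable_threshold    -- assumed safety valve: if at least this fraction of
                            the cluster is blacklisted, ignore the blacklist
    """
    if not nodes:
        raise ValueError("no NodeManagers available")

    # Only nodes that are still part of the cluster can be blacklisted.
    blacklist = set(failed_attempt_nodes) & set(nodes)

    # If too much of the cluster is blacklisted, fall back to using every node.
    if len(blacklist) >= disable_threshold * len(nodes):
        blacklist = set()

    # Take the first node that is not blacklisted (a real scheduler would
    # also look at free resources, locality, queues, and so on).
    candidates = [n for n in nodes if n not in blacklist]
    return candidates[0]


cluster = ["nm1", "nm2", "nm3"]
first_retry = pick_am_node(cluster, ["nm1"])            # skips nm1
second_retry = pick_am_node(cluster, ["nm1", "nm2"])    # skips nm1 and nm2
all_failed = pick_am_node(cluster, ["nm1", "nm2", "nm3"])  # blacklist ignored
```

Without this kind of logic (as on CDH 5.3.3), nothing stops every retry from landing on the same bad NM, which matches the behaviour described in the question.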