Created 06-22-2017 09:53 PM
I need help understanding the YARN ResourceManager Active/Standby behavior.
Setup Context:
To better understand how the YARN ResourceManager functions, I stopped the "Active" service to study the log output. A failover did not occur, as expected in my statement above. When I started the "Active" service back up, it was relabeled as "Standby". Well over 5 minutes have passed and the service remains labeled as "Standby".
At this point, there are two "Standby" services. Only the original "Active" service shows log activity, as expected. The log activity displays a number of Metrics errors:
Would anyone happen to be familiar with such a situation? I greatly appreciate your time and input.
Created 06-23-2017 06:36 PM
Hi @Anthony Seluk,
When you disable High Availability, automatic failover is disabled as well.
Hence, when the active RM is killed, there is no automatic failover to the standby RM [that is, the standby RM is not promoted to active]. As a result, there is no active RM; instead you have two standby RMs.
In such a scenario, you need to perform a manual failover. More information on manual failover is available here:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
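As a rough illustration of the manual failover described above, here is a minimal Python sketch that wraps the `yarn rmadmin` CLI. The RM IDs `rm1` and `rm2` are assumptions (use whatever is listed under `yarn.resourcemanager.ha.rm-ids` in your yarn-site.xml), and the helper names `rmadmin_cmd`, `get_state`, and `manual_failover` are hypothetical, made up for this example. It is a sketch, not a tested procedure for your cluster.

```python
import subprocess


def rmadmin_cmd(action, rm_id, force_manual=False):
    """Build a `yarn rmadmin` command as an argument list (nothing is run)."""
    cmd = ["yarn", "rmadmin", f"-{action}"]
    if force_manual:
        # Only relevant when automatic failover is enabled; use with caution.
        cmd.append("--forcemanual")
    cmd.append(rm_id)
    return cmd


def get_state(rm_id):
    """Return the state reported by `yarn rmadmin -getServiceState <rm_id>`."""
    result = subprocess.run(
        rmadmin_cmd("getServiceState", rm_id),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def manual_failover(target_rm, all_rms=("rm1", "rm2")):
    """Print each RM's state, then ask target_rm to become active."""
    for rm in all_rms:  # assumed RM IDs, adjust to your cluster
        print(rm, get_state(rm))
    subprocess.run(rmadmin_cmd("transitionToActive", target_rm), check=True)
```

On a node with the YARN client configured, calling `manual_failover("rm1")` would be roughly equivalent to running `yarn rmadmin -getServiceState` for each RM and then `yarn rmadmin -transitionToActive rm1`. If automatic failover is enabled, the transition command is refused unless `--forcemanual` is passed, which the linked documentation advises using with caution.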