Falcon with HA Resource Manager

I have seen the question about HA NameNodes, but HA Resource Managers still confuse me. In Hue, for example, you are told to add a second Resource Manager entry with the same logical Hue name. That is, Hue supports adding two Resource Manager URLs and it will try both.

How does that work in Falcon? How can I enter an HA Resource Manager entry in the interfaces of the cluster entity document? For NameNode HA I would use the logical name, and the program would then read hdfs-site.xml.

I have seen the other, similar questions for Oozie, but I am not sure they were answered, or I didn't really understand the answers.

https://community.hortonworks.com/questions/2740/what-value-should-i-use-for-jobtracker-for-resourc....

So, assume my active Resource Manager is

mycluster1.com:8050

and standby is

mycluster2.com:8050

1 ACCEPTED SOLUTION

@Benjamin Leonhard

RM works on the premise that even if you specify the standby RM URL, it will know to redirect to the active RM.

Ah, nice undercover magic. I will try it and see what happens when I switch the active RM off.

@Benjamin Leonhardi Did this work as expected? If we simply select one of the RMs in Falcon, will it fail over automatically to the secondary? I am trying to understand the impact if Falcon is pointing to one RM and that RM goes down.

@Sunile Manjee Yes, it works; Alex tested it as well. Falcon does not point to a single RM. It reads yarn-site.xml and finds the RM pair that contains the address you specified. Then it tries both.
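
To make that lookup concrete, here is a minimal sketch in Python. It is not Falcon's actual code (Falcon is written in Java), just an illustration of the idea under some assumptions: the yarn-site.xml fragment is made up from the two hosts in the question, and the helper names `load_properties` and `rm_candidates` are hypothetical. The property names are the standard YARN RM HA ones (`yarn.resourcemanager.ha.enabled`, `yarn.resourcemanager.ha.rm-ids`, `yarn.resourcemanager.address.<rm-id>`).

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml fragment built from the hosts in the question.
YARN_SITE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.address.rm1</name><value>mycluster1.com:8050</value></property>
  <property><name>yarn.resourcemanager.address.rm2</name><value>mycluster2.com:8050</value></property>
</configuration>"""


def load_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def rm_candidates(configured_address, props):
    """Return every RM address to try, given the single address from the cluster entity.

    If HA is enabled and the configured address belongs to the RM pair,
    return the whole pair; otherwise fall back to the configured address alone.
    """
    if props.get("yarn.resourcemanager.ha.enabled", "false").lower() != "true":
        return [configured_address]
    rm_ids = [i.strip() for i in props.get("yarn.resourcemanager.ha.rm-ids", "").split(",") if i.strip()]
    addresses = [props.get("yarn.resourcemanager.address." + rm_id) for rm_id in rm_ids]
    addresses = [a for a in addresses if a]
    return addresses if configured_address in addresses else [configured_address]


if __name__ == "__main__":
    # Pointing the cluster entity at the standby still yields both RMs.
    print(rm_candidates("mycluster2.com:8050", load_properties(YARN_SITE)))
```

So in this sketch it does not matter which of the two addresses ends up in the cluster entity, as long as the yarn-site.xml visible to the client lists both.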

Client, ApplicationMaster and NodeManager on RM failover

When there are multiple RMs, the configuration (yarn-site.xml) used by clients and nodes is expected to list all the RMs. Clients, ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in a round-robin fashion until they hit the Active RM. If the Active goes down, they resume the round-robin polling until they hit the “new” Active. This default retry logic is implemented as org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider. You can override the logic by implementing org.apache.hadoop.yarn.client.RMFailoverProxyProvider and setting the value of yarn.client.failover-proxy-provider to the class name.
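
The round-robin behaviour described above can be sketched roughly as follows. This is a toy Python simulation, not the real `ConfiguredRMFailoverProxyProvider` (which is Java and works on RPC proxies); the function `find_active`, the `is_active` probe, and the `max_attempts` limit are all assumptions made for illustration.

```python
def find_active(rm_addresses, is_active, start=0, max_attempts=10):
    """Cycle through the configured RMs, beginning at `start`, until one reports active.

    `is_active` is a hypothetical callable standing in for a real connection attempt.
    Returns the index of the active RM in `rm_addresses`.
    """
    if not rm_addresses:
        raise ValueError("no ResourceManager addresses configured")
    for attempt in range(max_attempts):
        index = (start + attempt) % len(rm_addresses)
        if is_active(rm_addresses[index]):
            return index
    raise RuntimeError("no active ResourceManager found")


# Simulated cluster state using the hosts from the question.
rms = ["mycluster1.com:8050", "mycluster2.com:8050"]
state = {"active": "mycluster1.com:8050"}


def probe(address):
    """Pretend connection check: only the currently active RM answers."""
    return address == state["active"]


current = find_active(rms, probe)                       # lands on mycluster1
state["active"] = "mycluster2.com:8050"                 # simulate a failover
current_after = find_active(rms, probe, start=current)  # resumes polling, lands on mycluster2

if __name__ == "__main__":
    print(rms[current], "->", rms[current_after])
```

The point of the sketch is only that the client keeps the full list of RMs and resumes polling from wherever it was, so a failover needs no configuration change on the client side.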

Here's some more info, @Benjamin Leonhardi.

@Benjamin Leonhardi Are you still having problems with this? Can you provide your own solution or accept the best answer?