
Test HA on ResourceManager

New Contributor

Hi guys,

I'm trying to test HA on the ResourceManager service.

I have 2 ResourceManager instances.
All jobs work fine, but as soon as I shut down the first node, which hosts the active ResourceManager, the cluster activates the second (standby) ResourceManager.

Once the second ResourceManager is active, I cannot start new jobs.

I am forced to restart the ResourceManager on the first node.

For information, I have a CDH cluster with OpenLDAP and a Kerberos server.

Regards,

1 ACCEPTED SOLUTION

Re: Test HA on ResourceManager

New Contributor

Hi guys,

I found the answer. ResourceManager HA does not work with the combination CDH5 / Kerberos / Isilon.

EMC has confirmed the bug but has not found a solution yet.

6 REPLIES

Re: Test HA on ResourceManager

Super Collaborator

What is the exception or error message shown in the logs when you try to start a new job?

 

Re: Test HA on ResourceManager

New Contributor

Hello Mathieu,

Here is the log output:

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/jars/hive-common-1.1.0-cdh5.8.0.jar!/hive-log4j.properties
OK
Time taken: 0.51 seconds
Query ID = richard_20170228204141_1d3b3cdd-b064-4b00-80c3-2c42d7bf1a16
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks not specified. Estimated from input data size: 17
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1488310804413_0001 to YARN : Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: zone.isilon.datalan.lan:8020, Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)
        at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:244)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
        at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:430)
        at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1115)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318)
        at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:416)
        at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:432)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:726)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1488310804413_0001 to YARN : Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: zone.isilon.datalan.lan:8020, Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:257)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:290)
        at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
        ... 38 more
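
The line that matters in the trace above is the `Failed to renew token` message, which names the token kind, the HDFS service endpoint, and the token owner. As a minimal sketch, assuming the message format shown in this log (other Hadoop versions may word it differently), the following Python pulls those three fields out of a log line. The function name `parse_renew_failure` is hypothetical and not part of any Hadoop tooling.

```python
import re

# Pattern modelled on the message in the log above (an assumption:
# other Hadoop releases may format this message differently).
RENEW_FAILURE = re.compile(
    r"Failed to renew token: Kind: (?P<kind>\S+), "
    r"Service: (?P<service>\S+), "
    r"Ident: \(.*? for (?P<owner>[^)]+)\)"
)


def parse_renew_failure(line):
    """Return kind, service and owner from a renewal failure line, or None."""
    match = RENEW_FAILURE.search(line)
    return match.groupdict() if match else None


# Sample copied from the exception message in this thread.
sample = (
    "Failed to submit application_1488310804413_0001 to YARN : "
    "Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, "
    "Service: zone.isilon.datalan.lan:8020, "
    "Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)"
)

print(parse_renew_failure(sample))
```

Here the extracted service is `zone.isilon.datalan.lan:8020`, so the token the ResourceManager fails to renew is one issued by the HDFS endpoint at that address.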

Re: Test HA on ResourceManager

Champion

The YARN service is unable to renew the HDFS delegation token on behalf of the user.

What values do you have for the settings below?

hadoop.proxyuser.yarn.hosts
hadoop.proxyuser.yarn.groups
hadoop.proxyuser.mapred.hosts
hadoop.proxyuser.mapred.groups
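
One way to check these four settings outside Cloudera Manager is to read them from the client configuration file. The sketch below is only an illustration: it parses a Hadoop-style `<configuration>` document with Python's standard library and reports which of the keys are set. The sample XML is made up, and the path `/etc/hadoop/conf/core-site.xml` mentioned in the comment is an assumption about where the file lives on a given host.

```python
import xml.etree.ElementTree as ET

# The four settings asked about above.
PROXYUSER_KEYS = [
    "hadoop.proxyuser.yarn.hosts",
    "hadoop.proxyuser.yarn.groups",
    "hadoop.proxyuser.mapred.hosts",
    "hadoop.proxyuser.mapred.groups",
]


def read_properties(xml_text):
    """Map property names to values from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


# Hypothetical sample. On a real host, read the file instead, for example
# open("/etc/hadoop/conf/core-site.xml").read() (the path is an assumption).
SAMPLE = """<configuration>
  <property>
    <name>hadoop.proxyuser.yarn.hosts</name>
    <value>*</value>
  </property>
</configuration>"""

props = read_properties(SAMPLE)
for key in PROXYUSER_KEYS:
    print(key, "=", props.get(key, "<not set>"))
```

A key reported as `<not set>` is simply absent from that file; it may still be supplied elsewhere, for example through a configuration snippet managed by Cloudera Manager.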

Re: Test HA on ResourceManager

New Contributor

I cannot find these settings in Cloudera Manager, and they are not defined in core-site.xml.


Re: Test HA on ResourceManager

This issue was resolved in OneFS 8.0.0.4.

See the release notes for the Resolved Issue:

During failover to a secondary ResourceManager, HDFS MapReduce jobs might have been disrupted. This could
have occurred because, during failover, OneFS renegotiated the connection to the ResourceManager using the
same Kerberos ticket but with a different name. As a result, the request to connect to the secondary
ResourceManager could not be authenticated and access was denied. (181448)