Test HA on ResourceManager
Labels: Apache YARN, Cloudera Manager, Kerberos
Created on 02-28-2017 12:08 PM - edited 09-16-2022 04:10 AM
Hi guys,
I'm trying to test HA on the ResourceManager service.
I have two ResourceManager instances.
All jobs work fine, but as soon as I shut down the first node, which hosts the active ResourceManager, the cluster activates the second (standby) ResourceManager.
Once the second ResourceManager is active, I cannot start new jobs.
I am forced to restart the ResourceManager on the first node.
For information, I have a CDH cluster with OpenLDAP and a Kerberos server.
Regards,
Created 03-12-2017 04:21 AM
Hi guys,
I found the answer: ResourceManager HA is not operational with the combination CDH5 / Kerberos / Isilon.
EMC has confirmed the bug but has not found a solution yet.
Created 03-01-2017 04:41 AM
What exception or error message appears in the logs when you try to start a new job?
Created 03-01-2017 10:04 AM
Hello Mathieu,
I'm attaching the log message.
Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/jars/hive-common-1.1.0-cdh5.8.0.jar!/hive-log4j.properties
OK
Time taken: 0.51 seconds
Query ID = richard_20170228204141_1d3b3cdd-b064-4b00-80c3-2c42d7bf1a16
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks not specified. Estimated from input data size: 17
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1488310804413_0001 to YARN : Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: zone.isilon.datalan.lan:8020, Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:244)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:430)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1115)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:416)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:432)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:726)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1488310804413_0001 to YARN : Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: zone.isilon.datalan.lan:8020, Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:257)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:290)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
    ... 38 more
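The useful part of the log above is the single "Failed to renew token" message: it names the token kind and the service that rejected the renewal. A minimal Python sketch for pulling those fields out of such a message follows. The regular expression and the `parse_renewal_failure` helper are illustrative assumptions written for this one message format, not part of any Hadoop tooling.

```python
import re

# Message text copied from the stack trace in this thread.
MESSAGE = (
    "Failed to submit application_1488310804413_0001 to YARN : "
    "Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, "
    "Service: zone.isilon.datalan.lan:8020, "
    "Ident: (HDFS_DELEGATION_TOKEN token 0 for richard)"
)

# Hypothetical pattern, written only for the message format shown above.
TOKEN_RE = re.compile(
    r"Failed to renew token: Kind: (?P<kind>\w+), "
    r"Service: (?P<service>[^,]+), "
    r"Ident: \((?P<ident>[^)]*)\)"
)


def parse_renewal_failure(message):
    """Return the token kind, service and ident as a dict, or None if absent."""
    match = TOKEN_RE.search(message)
    return match.groupdict() if match else None


info = parse_renewal_failure(MESSAGE)
print(info)
```

Here the service is the HDFS endpoint on port 8020, so the renewal is failing against the storage side rather than inside YARN itself.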
Created 03-01-2017 01:59 PM
What values do you have for the settings below?
hadoop.proxyuser.yarn.hosts
hadoop.proxyuser.yarn.groups
hadoop.proxyuser.mapred.hosts
hadoop.proxyuser.mapred.groups
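One way to check whether the four properties listed above are set is to read them straight out of a core-site.xml file. The Python sketch below assumes the standard Hadoop `<configuration>/<property>/<name>/<value>` layout. The `SAMPLE` document and its values are made up for illustration, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The four settings asked about in this thread.
PROXYUSER_KEYS = [
    "hadoop.proxyuser.yarn.hosts",
    "hadoop.proxyuser.yarn.groups",
    "hadoop.proxyuser.mapred.hosts",
    "hadoop.proxyuser.mapred.groups",
]


def read_properties(xml_text):
    """Parse Hadoop-style configuration XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def report(xml_text):
    """Map each proxyuser key to its value, or None when it is not defined."""
    props = read_properties(xml_text)
    return {key: props.get(key) for key in PROXYUSER_KEYS}


# Made-up sample; a real check would read the cluster's own core-site.xml.
SAMPLE = """<configuration>
  <property><name>hadoop.proxyuser.yarn.hosts</name><value>*</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://example:8020</value></property>
</configuration>"""

result = report(SAMPLE)
for key, value in result.items():
    print(key, "=", value if value is not None else "(not set)")
```

To run it against a real cluster, pass the contents of the deployed file to `report`, for example from `/etc/hadoop/conf/core-site.xml` (that path is an assumption and may differ on your hosts).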
Created 03-06-2017 09:55 AM
I can't find these values in Cloudera Manager, and they are not defined in core-site.xml.
Created 05-04-2017 12:02 PM
This issue was resolved in OneFS 8.0.0.4.
See the release notes for the Resolved Issue.
During failover to a secondary ResourceManager, HDFS MapReduce jobs might have been disrupted. This could
have occurred because, during failover, OneFS renegotiated the connection to the ResourceManager using the
same Kerberos ticket but with a different name. As a result, the request to connect to the secondary
ResourceManager could not be authenticated and access was denied. (181448)