Created 12-15-2015 02:08 PM
After building out a HDP 2.2 cluster (single node) using blueprint I'm getting the following error around the ResourceManager.
$ less /var/log/hadoop-yarn/yarn/yarn-yarn-resourcemanager-gsc01-ost-tesla-h-hb01.td.local.log STARTUP_MSG: Starting ResourceManager STARTUP_MSG: host = gsc01-ost-tesla-h-hb01.td.local/192.168.106.26 STARTUP_MSG: args = [] STARTUP_MSG: version = 2.6.0.2.2.9.0-3393 ... 2015-12-15 01:01:47,671 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service RMActiveServices failed in state INITED; cause: java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100]. java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100]. at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.internalGetLabeledQueueCapacity(CapacitySchedulerConfiguration.java:465) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getLabeledQueueCapacity(CapacitySchedulerConfiguration.java:477) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadCapacitiesByLabelsFromConf(CSQueueUtils.java:143) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadUpdateAndCheckCapacities(CSQueueUtils.java:122) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupConfigurableCapacities(AbstractCSQueue.java:99) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupQueueConfigs(AbstractCSQueue.java:242) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:109) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.<init>(ParentQueue.java:100) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:589) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:576) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1016) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:269) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1237) 2015-12-15 01:01:47,672 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping ResourceManager metrics system...
My blueprint file is intentionally sparse so I'm only calling out components without setting any configurations unless needed.
{ "host_groups" : [ { "name" : "host_group_1", "configurations" : [ ], "components" : [ { "name" : "ZOOKEEPER_SERVER" }, { "name" : "ZOOKEEPER_CLIENT" }, ... ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.2" }
I suspect this message a bit up in the logs might be related:
2015-12-15 01:01:47,598 INFO conf.Configuration (Configuration.java:getConfResourceAsInputStream(2236)) - found resource capacity-scheduler.xml at file:/etc/hadoop/conf.empty/capacity-scheduler.xml 2015-12-15 01:01:47,663 WARN capacity.CapacitySchedulerConfiguration (CapacitySchedulerConfiguration.java:getAccessibleNodeLabels(433)) - Accessible node labels for root queue will be ignored, it will be automatically set to "*". 2015-12-15 01:01:47,668 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED; cause: java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100].
Looking in the mentioned .xml file:
<property> <name>yarn.scheduler.capacity.root.accessible-node-labels.default.capacity</name> <value>-1</value> </property> <property> <name>yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity</name> <value>-1</value> </property>
Do I just need to set these in my blueprint file?
NOTE: Here's the full .xml file: capacity-schedulerxml.txt
EDIT #1
I took these 2 properties out of the above .xml file and attempted to restart ResourceManager, but it's still throwing the same exception:
2015-12-15 10:40:51,231 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1241)) - Error starting ResourceManager java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100]. at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.internalGetLabeledQueueCapacity(CapacitySchedulerConfiguration.java:465) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getLabeledQueueCapacity(CapacitySchedulerConfiguration.java:477) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadCapacitiesByLabelsFromConf(CSQueueUtils.java:143) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadUpdateAndCheckCapacities(CSQueueUtils.java:122) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupConfigurableCapacities(AbstractCSQueue.java:99) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupQueueConfigs(AbstractCSQueue.java:242) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:109) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.<init>(ParentQueue.java:100) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:589) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:576) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1016) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:269) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1237) 2015-12-15 10:40:51,233 INFO resourcemanager.ResourceManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
Created 12-15-2015 02:30 PM
This sounds like a bug, and both values are removed in newer Ambari versions.
https://issues.apache.org/jira/browse/AMBARI-13232
Could you remove the following two values and try to restart the RM:
Created 12-15-2015 03:51 PM
Looks like someone on SO is having the same issue as well: http://stackoverflow.com/questions/34054426/hadoop-resourcemanager-will-not-start-bad-scheduler-sett....
Created 12-15-2015 10:37 PM
You implied that it might be an issue with Ambari so I looked at my version and it was 2.1.0. I changed my .repo file to use 2.1.2 instead and the problem went away! It also cured a problem I had with Kafka too. Thanks for the tip.
Is there any way in the ambari.repo file I'm using to always use the latest Ambari instead of it hard coded (like 2.1.2)?
Created 12-15-2015 10:41 PM
Yes I have seen several Bug reports in our internal Jira.
Regarding the hard coded ambari version, I dont know of any setting that will always download the latest version
Created 02-02-2016 01:15 PM
Hi, Can anyone told me how to set parameters through the config tab of Ambari? I wanna know why I cann’t change anything during the config tab of Ambari, seems like it’s readonly.. Thanks.
Created 02-03-2016 04:13 AM
Problem resolved, I find out that it's possible to change paras in the service configs tab of Ambari, but not in the host configs tab.
Created 02-05-2016 06:39 AM
See my answer which shows the screen where I made this modification.
Created on 02-05-2016 06:38 AM - edited 08-19-2019 05:34 AM
FYI, the method I used to remove the 2 offending parameters was to do it through Ambari. If you navigate to the config tab of YARN you can go to the scheduler section and delete the 2 options in the Capacity Scheduler textbox. That textbox shows the 2 options like so: