ResourceManager cannot start

Rising Star

After building out an HDP 2.2 cluster (single node) using an Ambari blueprint, I'm getting the following error from the ResourceManager.

$ less /var/log/hadoop-yarn/yarn/yarn-yarn-resourcemanager-gsc01-ost-tesla-h-hb01.td.local.log
STARTUP_MSG: Starting ResourceManager
STARTUP_MSG:   host = gsc01-ost-tesla-h-hb01.td.local/192.168.106.26
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0.2.2.9.0-3393
...
2015-12-15 01:01:47,671 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service RMActiveServices failed in state INITED; cause: java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100].
java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100].
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.internalGetLabeledQueueCapacity(CapacitySchedulerConfiguration.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getLabeledQueueCapacity(CapacitySchedulerConfiguration.java:477)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadCapacitiesByLabelsFromConf(CSQueueUtils.java:143)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadUpdateAndCheckCapacities(CSQueueUtils.java:122)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupConfigurableCapacities(AbstractCSQueue.java:99)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupQueueConfigs(AbstractCSQueue.java:242)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:109)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.<init>(ParentQueue.java:100)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:589)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:576)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1016)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:269)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1237)
2015-12-15 01:01:47,672 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping ResourceManager metrics system...

My blueprint file is intentionally sparse: I only list components and don't set any configurations unless needed.

{
  "host_groups" : [
    {
      "name" : "host_group_1",
      "configurations" : [ ],
      "components" : [
        { "name" : "ZOOKEEPER_SERVER" },
        { "name" : "ZOOKEEPER_CLIENT" },

...

  ],
  "Blueprints" : {
    "stack_name" : "HDP",
    "stack_version" : "2.2"
  }

I suspect these messages, slightly earlier in the log, are related:

2015-12-15 01:01:47,598 INFO  conf.Configuration (Configuration.java:getConfResourceAsInputStream(2236)) - found resource capacity-scheduler.xml at file:/etc/hadoop/conf.empty/capacity-scheduler.xml
2015-12-15 01:01:47,663 WARN  capacity.CapacitySchedulerConfiguration (CapacitySchedulerConfiguration.java:getAccessibleNodeLabels(433)) - Accessible node labels for root queue will be ignored, it will be automatically set to "*".
2015-12-15 01:01:47,668 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED; cause: java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100].

Looking in the capacity-scheduler.xml file mentioned in that log, I found:

    <property>
      <name>yarn.scheduler.capacity.root.accessible-node-labels.default.capacity</name>
      <value>-1</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity</name>
      <value>-1</value>
    </property>

Do I just need to set these in my blueprint file?

NOTE: Here's the full .xml file: capacity-schedulerxml.txt
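Values like the two shown above can be spotted before a restart by scanning the file for node-label capacity properties that fall outside the range quoted in the exception. The following is a minimal Python sketch, not something from the original post. It assumes the usual Hadoop `<configuration>`/`<property>` layout; the function name `find_invalid_capacities` and the sample XML are hypothetical, and the [0, 100] range is taken only from the error message.

```python
import xml.etree.ElementTree as ET

# The two kinds of property shown in the post end with one of these suffixes.
CAPACITY_SUFFIXES = (".capacity", ".maximum-capacity")


def find_invalid_capacities(xml_text):
    """Return (name, value) pairs for node-label capacity properties outside [0, 100]."""
    root = ET.fromstring(xml_text)
    invalid = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        # Heuristic (assumption): only look at per-node-label capacity settings.
        if ".accessible-node-labels." not in name:
            continue
        if not name.endswith(CAPACITY_SUFFIXES):
            continue
        try:
            number = float(value)
        except ValueError:
            continue  # non-numeric value; this sketch skips it
        if not 0.0 <= number <= 100.0:
            invalid.append((name, value))
    return invalid


# Illustrative sample only; it mirrors the two properties quoted in the post.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.default.capacity</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity</name>
    <value>-1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in find_invalid_capacities(SAMPLE):
        print(name, "=", value)
```

To check a real file, read its contents and pass the text to `find_invalid_capacities`; the path to use depends on which configuration directory the ResourceManager actually loads.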

EDIT #1

I removed these two properties from the .xml file above and tried to restart the ResourceManager, but it still throws the same exception:

2015-12-15 10:40:51,231 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1241)) - Error starting ResourceManager
java.lang.IllegalArgumentException: Illegal capacity of -1.0 for node-label=default in queue=root, valid capacity should in range of [0, 100].
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.internalGetLabeledQueueCapacity(CapacitySchedulerConfiguration.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getLabeledQueueCapacity(CapacitySchedulerConfiguration.java:477)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadCapacitiesByLabelsFromConf(CSQueueUtils.java:143)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadUpdateAndCheckCapacities(CSQueueUtils.java:122)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupConfigurableCapacities(AbstractCSQueue.java:99)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupQueueConfigs(AbstractCSQueue.java:242)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:109)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.<init>(ParentQueue.java:100)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:589)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:576)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1016)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:269)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1237)
2015-12-15 10:40:51,233 INFO  resourcemanager.ResourceManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:

1 ACCEPTED SOLUTION

This looks like a bug; both properties were removed in newer Ambari versions.

https://issues.apache.org/jira/browse/AMBARI-13232

Could you remove the following two properties and try restarting the ResourceManager?

  • yarn.scheduler.capacity.root.accessible-node-labels.default.capacity
  • yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity
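The removal suggested above can be illustrated on a local copy of the configuration. The following is a minimal Python sketch under stated assumptions, not the answer author's method: the function name `remove_properties` and the sample XML are hypothetical, and the file is assumed to use the usual Hadoop `<configuration>`/`<property>` layout. On an Ambari-managed cluster the change is normally made through Ambari itself, so treat this purely as an illustration of what "removing the two properties" means.

```python
import xml.etree.ElementTree as ET

# The two property names listed in the answer above.
OFFENDING = {
    "yarn.scheduler.capacity.root.accessible-node-labels.default.capacity",
    "yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity",
}


def remove_properties(xml_text, names=OFFENDING):
    """Return the configuration XML with the named <property> elements removed."""
    root = ET.fromstring(xml_text)
    # Copy the list first so elements can be removed while looping.
    for prop in list(root.findall("property")):
        name = (prop.findtext("name") or "").strip()
        if name in names:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


# Illustrative sample only; a real capacity-scheduler.xml holds more properties.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.default.capacity</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity</name>
    <value>-1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(remove_properties(SAMPLE))
```

The function works on text rather than on a file path so that it never touches a live configuration by accident.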


16 REPLIES

New Contributor
@Jonas Straub

You implied that it might be an issue with Ambari, so I checked my version and it was 2.1.0. I changed my .repo file to use 2.1.2 instead and the problem went away! It also cured a problem I had with Kafka. Thanks for the tip.

Is there any way to make the ambari.repo file I'm using always pull the latest Ambari instead of a hard-coded version (like 2.1.2)?

Yes, I have seen several bug reports about this in our internal Jira.

Regarding the hard-coded Ambari version, I don't know of any setting that will always download the latest version.

New Contributor

Hi, can anyone tell me how to set parameters through the config tab of Ambari? I want to know why I can't change anything in the config tab of Ambari; it seems to be read-only. Thanks.

New Contributor

Problem resolved. I found out that it's possible to change parameters in the service configs tab of Ambari, but not in the host configs tab.

Rising Star

See my answer which shows the screen where I made this modification.

Rising Star

FYI, I removed the two offending parameters through Ambari. If you navigate to the config tab of YARN, you can go to the scheduler section and delete the two options in the Capacity Scheduler textbox. That textbox shows the two options like so:

[Screenshot: 1829-ambari-gsc01-ost.png]
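For anyone who prefers to prepare the edited text outside the browser, deleting the two options from the textbox content can be sketched in a few lines. This is a minimal Python sketch under an assumption: that the Capacity Scheduler textbox holds one `key=value` pair per line. The function name `drop_keys` and the sample lines are hypothetical and are not copied from the screenshot.

```python
# The two options deleted in the post.
OFFENDING = (
    "yarn.scheduler.capacity.root.accessible-node-labels.default.capacity",
    "yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity",
)


def drop_keys(text, keys=OFFENDING):
    """Return the key=value text with lines for the given keys removed."""
    kept = []
    for line in text.splitlines():
        # Everything before the first "=" is treated as the property name.
        key = line.split("=", 1)[0].strip()
        if key in keys:
            continue
        kept.append(line)
    return "\n".join(kept)


# Illustrative textbox content (assumption), one property per line.
SAMPLE = "\n".join([
    "yarn.scheduler.capacity.root.capacity=100",
    "yarn.scheduler.capacity.root.accessible-node-labels.default.capacity=-1",
    "yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity=-1",
    "yarn.scheduler.capacity.root.queues=default",
])

if __name__ == "__main__":
    print(drop_keys(SAMPLE))
```

The result would then be pasted back into the textbox and saved through Ambari, followed by a ResourceManager restart.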