Created 11-02-2018 07:10 PM
I am receiving the following message from each NodeManager when attempting to start YARN after enabling Docker. What is the root cause?
2018-11-02 18:28:50,974 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1662)) - Loaded NM state version info 1.2
2018-11-02 18:28:51,174 INFO resources.ResourceHandlerModule (ResourceHandlerModule.java:initNetworkResourceHandler(182)) - Using traffic control bandwidth handler
2018-11-02 18:28:51,193 WARN resources.CGroupsBlkioResourceHandlerImpl (CGroupsBlkioResourceHandlerImpl.java:checkDiskScheduler(101)) - Device vda does not use the CFQ scheduler; disk isolation using CGroups will not work on this partition.
2018-11-02 18:28:51,199 INFO resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(410)) - Initializing mounted controller blkio at /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,199 INFO resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(420)) - Yarn control group does not exist. Creating /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,200 ERROR nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(323)) - Failed to bootstrap configured resource subsystems!
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Unexpected: Cannot create yarn cgroup Subsystem:blkio Mount points:/proc/mounts User:yarn Path:/sys/fs/cgroup/blkio/hadoop-yarn
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializePreMountedCGroupController(CGroupsHandlerImpl.java:425)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializeCGroupController(CGroupsHandlerImpl.java:377)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsBlkioResourceHandlerImpl.bootstrap(CGroupsBlkioResourceHandlerImpl.java:123)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.bootstrap(ResourceHandlerChain.java:58)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:320)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
2018-11-02 18:28:51,205 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
    ... 3 more
2018-11-02 18:28:51,207 ERROR nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(936)) - Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
    ... 3 more
Created 11-10-2018 12:59 AM
It turns out that the guide I was following was outdated. I tried this again on a different cluster, and it worked perfectly with the Ambari default for yarn.nodemanager.linux-container-executor.cgroups.mount-path ("/cgroup").
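To see which value a node is actually using for this property, something like the following could help. It is only a rough sketch: show_cgroup_mount_path is a made-up helper name, the config location /etc/hadoop/conf/yarn-site.xml is an assumption about a typical Ambari-managed node, and it assumes the <value> element sits within two lines of the <name> element.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop or Ambari): print the lines of a
# yarn-site.xml that set the cgroups mount-path property.
show_cgroup_mount_path() {
    local site_file="$1"   # path to yarn-site.xml (assumed location is passed in by the caller)
    local prop="yarn.nodemanager.linux-container-executor.cgroups.mount-path"

    # -F matches the property name literally; -A 2 also prints the next two
    # lines, where the <value> element normally appears.
    grep -F -A 2 "$prop" "$site_file"
}
```

For example: show_cgroup_mount_path /etc/hadoop/conf/yarn-site.xml (adjust the path if your configuration directory differs). If nothing is printed, the property is not set explicitly in that file.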
Created 11-02-2018 07:14 PM
I was able to work around this error by running:
sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn
I then received a very similar message for "/sys/fs/cgroup/memory/hadoop-yarn" and "/sys/fs/cgroup/cpu/hadoop-yarn". After creating these directories as well, the node managers came up.
Here is the full work-around that was run on each node:
sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn
sudo mkdir /sys/fs/cgroup/memory/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/memory/hadoop-yarn
sudo mkdir /sys/fs/cgroup/cpu/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/cpu/hadoop-yarn
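The same workaround can be written as a loop over the three controllers, which is easier to push out to many nodes. This is a minimal sketch, not an official procedure: create_yarn_cgroups is a hypothetical helper name, it uses mkdir -p (so rerunning it is harmless) where the commands above use plain mkdir, and it expects to be run as root instead of calling sudo itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): create the hadoop-yarn cgroup
# directory under each controller and hand ownership to the given user.
create_yarn_cgroups() {
    local cgroup_root="$1"   # e.g. /sys/fs/cgroup
    local owner="$2"         # e.g. yarn:yarn
    local controller dir

    # The three controllers that produced errors in this thread.
    for controller in blkio memory cpu; do
        dir="${cgroup_root}/${controller}/hadoop-yarn"
        mkdir -p "$dir" || return 1
        chown -R "$owner" "$dir" || return 1
    done
}
```

As root on each node, this would be called as: create_yarn_cgroups /sys/fs/cgroup yarn:yarn. Note that directories created under /sys/fs/cgroup do not survive a reboot, so the step would have to be repeated after each restart of the host.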