Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

YARN Node Managers failing to start after enabling docker

avatar
Rising Star

I am receiving the following message from each node manager when attempting to start YARN after enabling docker. What is the root cause?

2018-11-02 18:28:50,974 INFO  recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1662)) - Loaded NM state version info 1.2
2018-11-02 18:28:51,174 INFO  resources.ResourceHandlerModule (ResourceHandlerModule.java:initNetworkResourceHandler(182)) - Using traffic control bandwidth handler
2018-11-02 18:28:51,193 WARN  resources.CGroupsBlkioResourceHandlerImpl (CGroupsBlkioResourceHandlerImpl.java:checkDiskScheduler(101)) - Device vda does not use the CFQ scheduler; disk isolation using CGroups will not work on this partition.

2018-11-02 18:28:51,199 INFO  resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(410)) - Initializing mounted controller blkio at /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,199 INFO  resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(420)) - Yarn control group does not exist. Creating /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,200 ERROR nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(323)) - Failed to bootstrap configured resource subsystems! 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Unexpected: Cannot create yarn cgroup Subsystem:blkio Mount points:/proc/mounts User:yarn Path:/sys/fs/cgroup/blkio/hadoop-yarn 
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializePreMountedCGroupController(CGroupsHandlerImpl.java:425)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializeCGroupController(CGroupsHandlerImpl.java:377)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsBlkioResourceHandlerImpl.bootstrap(CGroupsBlkioResourceHandlerImpl.java:123)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.bootstrap(ResourceHandlerChain.java:58)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:320)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
2018-11-02 18:28:51,205 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	... 3 more
2018-11-02 18:28:51,207 ERROR nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(936)) - Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	... 3 more

1 ACCEPTED SOLUTION

avatar
Rising Star

Turns out, the guide that I was following was outdated. I tried this again on a different cluster and it worked perfectly with the Ambari default for yarn.nodemanager.linux-container-executor.cgroups.mount-path ("/cgroup")

View solution in original post

2 REPLIES 2

avatar
Rising Star

I was able to work around this error by running:

sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn

I then received a very similar message for "/sys/fs/cgroup/memory/hadoop-yarn" and "/sys/fs/cgroup/cpu/hadoop-yarn". After creating these directories as well, the node managers came up.

Here is the full work-around that was run on each node:

sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn
sudo mkdir /sys/fs/cgroup/memory/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/memory/hadoop-yarn
sudo mkdir /sys/fs/cgroup/cpu/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/cpu/hadoop-yarn

avatar
Rising Star

Turns out, the guide that I was following was outdated. I tried this again on a different cluster and it worked perfectly with the Ambari default for yarn.nodemanager.linux-container-executor.cgroups.mount-path ("/cgroup")