Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

YARN Node Managers failing to start after enabling docker

avatar
Rising Star

I am receiving the following message from each node manager when attempting to start YARN after enabling docker. What is the root cause?

2018-11-02 18:28:50,974 INFO  recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1662)) - Loaded NM state version info 1.2
2018-11-02 18:28:51,174 INFO  resources.ResourceHandlerModule (ResourceHandlerModule.java:initNetworkResourceHandler(182)) - Using traffic control bandwidth handler
2018-11-02 18:28:51,193 WARN  resources.CGroupsBlkioResourceHandlerImpl (CGroupsBlkioResourceHandlerImpl.java:checkDiskScheduler(101)) - Device vda does not use the CFQ scheduler; disk isolation using CGroups will not work on this partition.

2018-11-02 18:28:51,199 INFO  resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(410)) - Initializing mounted controller blkio at /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,199 INFO  resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:initializePreMountedCGroupController(420)) - Yarn control group does not exist. Creating /sys/fs/cgroup/blkio/hadoop-yarn
2018-11-02 18:28:51,200 ERROR nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(323)) - Failed to bootstrap configured resource subsystems! 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Unexpected: Cannot create yarn cgroup Subsystem:blkio Mount points:/proc/mounts User:yarn Path:/sys/fs/cgroup/blkio/hadoop-yarn 
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializePreMountedCGroupController(CGroupsHandlerImpl.java:425)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializeCGroupController(CGroupsHandlerImpl.java:377)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsBlkioResourceHandlerImpl.bootstrap(CGroupsBlkioResourceHandlerImpl.java:123)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.bootstrap(ResourceHandlerChain.java:58)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:320)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
2018-11-02 18:28:51,205 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	... 3 more
2018-11-02 18:28:51,207 ERROR nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(936)) - Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:393)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:324)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
	... 3 more

1 ACCEPTED SOLUTION

avatar
Rising Star

Turns out, the guide that I was following was outdated. I tried this again on a different cluster and it worked perfectly with the Ambari default for yarn.nodemanager.linux-container-executor.cgroups.mount-path ("/cgroup")

View solution in original post

2 REPLIES 2

avatar
Rising Star

I was able to work around this error by running:

sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn

I then received a very similar message for "/sys/fs/cgroup/memory/hadoop-yarn" and "/sys/fs/cgroup/cpu/hadoop-yarn". After creating these directories as well, the node managers came up.

Here is the full work-around that was run on each node:

sudo mkdir /sys/fs/cgroup/blkio/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/blkio/hadoop-yarn
sudo mkdir /sys/fs/cgroup/memory/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/memory/hadoop-yarn
sudo mkdir /sys/fs/cgroup/cpu/hadoop-yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/cpu/hadoop-yarn

avatar
Rising Star

Turns out, the guide that I was following was outdated. I tried this again on a different cluster and it worked perfectly with the Ambari default for yarn.nodemanager.linux-container-executor.cgroups.mount-path ("/cgroup")