Hi,
I've just upgraded the cluster from 5.14 to 5.16, but now none of the node managers will start. They fail with the following error:
2018-12-04 13:26:15,283 DEBUG org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: checkLinuxExecutorSetup: [/var/lib/yarn-ce/bin/container-executor, --checksetup]
2018-12-04 13:26:15,287 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container executor initialization is : 24
ExitCodeException exitCode=24: Invalid conf file provided : /var/lib/yarn-ce/etc/hadoop/container-executor.cfg
at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
at org.apache.hadoop.util.Shell.run(Shell.java:507)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:267)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
I've tried the fixes suggested in various articles about permissions etc., but none of them work. The file /var/lib/yarn-ce/etc/hadoop/container-executor.cfg seems to be recreated every time I attempt to start the node manager. I know there were some bug fixes in 5.15 and 5.16 relating to this.
Any help would be greatly appreciated, as the cluster is currently down.