Created on 01-24-2019 01:07 AM - edited 09-16-2022 07:05 AM
Execute command Start this NodeManager on role NodeManager
Failed to start role
Supervisor returned FATAL. Please check the role log file, stderr, or stdout.
Environment details:
Version: Cloudera Express 5.15.0
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java VM Vendor: Oracle Corporation
Java Version: 1.7.0_67
System details:
Linux optim-rhel72-uppu.development.unicomglobal.software 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
I have followed the steps under "Configuring TLS/SSL for HDFS, YARN and MapReduce" using the link https://www.cloudera.com/documentation/enterprise/5-15-x/topics/sg_hive_encryption.html
Service did not start successfully; not all of the required roles started: only 0/1 roles started. Reasons : Service has only 0 NodeManager roles running instead of minimum required 1
I see below error in the role log:
Created 02-11-2019 02:53 PM
Hi @Tulasi,
Sorry for my late reply. From the output you sent below:
[root@optim-rhel72-uppu ~]# id yarn
uid=1007(yarn) gid=1010(hadoop) groups=1010(hadoop)
This looks a little different than my test cluster. Can you please do this?
usermod -g yarn yarn
usermod -a -G hadoop yarn
Also, please paste the content of this file:
/opt/cloudera/parcels/CDH/meta/permissions.json
Thanks,
Li
Li Wang, Technical Solution Manager
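The `id yarn` output quoted above lists only the hadoop group, which is why the reply asks for the yarn user to be put into the yarn group. The following is a minimal sketch for checking group membership from a line of `id` output. The function name `id_output_has_group` is hypothetical and not part of any Cloudera tooling, and it assumes the usual `uid=... gid=... groups=...` layout.

```shell
# Hypothetical helper: succeed if a line of `id` output lists the given group.
# Assumes the usual "uid=... gid=... groups=N(name),N(name)" layout.
id_output_has_group() {
    local id_line="$1" group="$2" grp_list
    # Keep only the text after "groups=".
    grp_list=${id_line##*groups=}
    # Drop anything after the first space (for example an SELinux context).
    grp_list=${grp_list%% *}
    case ",$grp_list," in
        *"($group),"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Possible usage on a NodeManager host: `id_output_has_group "$(id yarn)" yarn && echo "yarn is in group yarn"`.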
Created 01-28-2019 10:33 AM
Created 01-28-2019 10:43 PM
Hi Jerry,
Here is what I have on my system:
/etc/hadoop/conf.cloudera.yarn
-r-------- 1 root hadoop 156 Jan 24 01:00 container-executor.cfg
-rw-r--r-- 1 root root 3894 Jan 17 22:56 core-site.xml
-rw-r--r-- 1 root root 617 Jan 17 22:56 hadoop-env.sh
-rw-r--r-- 1 root root 2729 Jan 17 22:56 hdfs-site.xml
Even if I change the permissions on the files above, they revert to the same permissions after the role starts.
From manager I have this
Container Executor Group = yarn
An upgrade is not possible either, because it requires all services to be up and running.
Let me know if you need any more details.
Thanks,
Tulasi
Created 01-29-2019 06:47 AM
Hi Tulasi,
Could you check the value of the Container Executor Group property in the "container-executor.cfg" file and cross-check it against the CM configuration?
Thanks
Jerry
Created 01-29-2019 09:20 AM
Created 01-29-2019 06:59 PM
Hi @Tulasi,
Could you please send us the output of the command below from all the NodeManager hosts?
ls -alt /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
The correct permission should be like this:
---Sr-s--- 1 root yarn 53728 Jan 28 14:03 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
If it looks different, you can perform the following steps on all NodeManagers:
chmod 6050 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
chgrp yarn /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
Thanks and hope this helps,
Li
Li Wang, Technical Solution Manager
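The permission check described above can also be scripted. This is only a sketch: the function name `check_ce_perms` is hypothetical, and it assumes GNU `stat` (as found on RHEL), where `%a` prints the octal mode and `%G` the group name. The mode string `---Sr-s---` corresponds to octal 6050.

```shell
# Hypothetical helper: compare a file's octal mode and group with expected values.
# Assumes GNU coreutils stat (-c with %a and %G).
check_ce_perms() {
    local path="$1" want_mode="$2" want_group="$3"
    local mode group
    mode=$(stat -c '%a' "$path") || return 2
    group=$(stat -c '%G' "$path") || return 2
    if [ "$mode" = "$want_mode" ] && [ "$group" = "$want_group" ]; then
        echo "OK: $path mode=$mode group=$group"
    else
        echo "MISMATCH: $path mode=$mode group=$group (expected $want_mode $want_group)"
        return 1
    fi
}
```

Possible usage: `check_ce_perms /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor 6050 yarn`.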
Created 01-29-2019 08:36 PM
Hi Li,
I changed the permissions, but that still didn't fix the problem.
[root@optim-rhel72-uppu bin]# ls -alt /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
---Sr-s--- 1 root yarn 53712 May 24 2018 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
In the file below, could the problem have something to do with banned.users?
[root@optim-rhel72-uppu conf.cloudera.yarn]# cat /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=1000
allowed.system.users=nobody,impala,hive,llama,hbase
banned.users=hdfs,yarn,mapred,bin
Thanks,
Tulasi
Created 01-30-2019 10:24 AM
Hi @Tulasi,
The banned.users property prevents jobs from being submitted under those user accounts. It should not prevent the NodeManager from starting.
I suggest checking these doc links:
and
How many nodes does your cluster have? Have you checked all the permissions?
Thanks,
Li
Li Wang, Technical Solution Manager
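To see which accounts banned.users affects, the property can be read straight from container-executor.cfg. A minimal sketch follows; the function name `is_banned_user` is hypothetical, and it assumes the simple `key=value` layout with a comma-separated list shown in the file quoted earlier.

```shell
# Hypothetical helper: succeed if the user appears in banned.users of the given cfg file.
# Assumes a "banned.users=a,b,c" line with no spaces around the commas.
is_banned_user() {
    local cfg="$1" user="$2" banned
    banned=$(sed -n 's/^banned\.users=//p' "$cfg")
    case ",$banned," in
        *",$user,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Possible usage: `is_banned_user /etc/hadoop/conf.cloudera.yarn/container-executor.cfg yarn && echo "yarn cannot submit jobs"`. This only reports job-submission restrictions and says nothing about whether the NodeManager itself can start.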
Created 01-30-2019 11:51 PM
Hi Li,
Everything on a single node.
/opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin
[root@optim-rhel72-uppu bin]# ls -lrt
total 80
-rwxr-xr-x 1 root root 12476 May 24 2018 yarn
-rwxr-xr-x 1 root root 5463 May 24 2018 mapred
---Sr-s--- 1 root yarn 53712 May 24 2018 container-executor
This is the error from /var/log/hadoop-yarn/hadoop-cmf-yarn-NODEMANAGER-optim-rhel72-uppu.development.unicomglobal.software.log.out
2019-01-30 23:42:45,872 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:269)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
Caused by: java.io.IOException: Cannot run program "/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop-yarn/bin/container-executor": error=13, Permission denied
Thanks,
Tulasi
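An error=13 (Permission denied) on the binary can come from the file itself or from any directory on the way to it, so it may help to look at every component of the path in the stack trace. The sketch below only lists the path and each of its parent directories; the function name `path_ancestors` is hypothetical.

```shell
# Hypothetical helper: print a path followed by each of its parent directories,
# stopping before "/" (or "." for a relative path).
path_ancestors() {
    local p="$1"
    while [ -n "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        echo "$p"
        p=$(dirname "$p")
    done
}
```

Possible usage: `path_ancestors /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor | xargs ls -ld`, then confirm that the user running the NodeManager has execute (search) permission on every directory listed.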
Created 01-31-2019 01:19 PM
Hi @Tulasi,
Could you please run the command below and send us the output?
id yarn
Thanks,
Li
Li Wang, Technical Solution Manager