01-24-2019 01:07 AM - last edited on 01-24-2019 06:48 AM by cjervis
Execute command Start this NodeManager on role NodeManager
Failed to start role
Supervisor returned FATAL. Please check the role log file, stderr, or stdout.
Version: Cloudera Express 5.15.0
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java VM Vendor: Oracle Corporation
Java Version: 1.7.0_67
Linux optim-rhel72-uppu.development.unicomglobal.software 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
I have followed the steps under "Configuring TLS/SSL for HDFS, YARN and MapReduce" using the link https://www.cloudera.com/documentation/enterprise/5-15-x/topics/sg_hive_encryption.html
Service did not start successfully; not all of the required roles started: only 0/1 roles started. Reasons : Service has only 0 NodeManager roles running instead of minimum required 1
I see the error below in the role log:
01-28-2019 10:33 AM
01-28-2019 10:43 PM
Here is what I have on my system:
-r-------- 1 root hadoop 156 Jan 24 01:00 container-executor.cfg
-rw-r--r-- 1 root root 3894 Jan 17 22:56 core-site.xml
-rw-r--r-- 1 root root 617 Jan 17 22:56 hadoop-env.sh
-rw-r--r-- 1 root root 2729 Jan 17 22:56 hdfs-site.xml
Even if I change the permissions on the files above, they are reset to the same values after the role is started.
In Cloudera Manager I have this:
Container Executor Group = yarn
An upgrade is not possible either, because it requires all services to be up and running.
Let me know if you need any more details.
01-29-2019 06:47 AM
Could you check the value of the Container Executor Group property in the "container-executor.cfg" file and cross-check it against the CM configuration?
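One way to do that cross-check from a shell is sketched below. It assumes the file path from this thread and that the group is stored under the key yarn.nodemanager.linux-container-executor.group (the usual YARN property name, but please confirm it in your own file). The helper name cfg_value is made up for this example. The file is readable only by root, so run it as root.

```shell
#!/usr/bin/env bash
# cfg_value FILE KEY: print the value of KEY from a key=value file
# such as container-executor.cfg (hypothetical helper for this sketch).
cfg_value() {
  local file=$1 key=$2
  # Match the key exactly in the first '='-separated field and print
  # everything after "KEY=" so values containing '=' survive intact.
  awk -F= -v k="$key" '$1 == k { print substr($0, length(k) + 2); exit }' "$file"
}

CFG=/etc/hadoop/conf.cloudera.yarn/container-executor.cfg
KEY=yarn.nodemanager.linux-container-executor.group   # assumed property name
EXPECTED=yarn   # the "Container Executor Group" value shown in Cloudera Manager

if [ -r "$CFG" ]; then
  actual=$(cfg_value "$CFG" "$KEY")
  if [ "$actual" = "$EXPECTED" ]; then
    echo "match: $actual"
  else
    echo "mismatch: file has '$actual', Cloudera Manager has '$EXPECTED'"
  fi
fi
```

If the two values differ, that mismatch is worth fixing in Cloudera Manager before looking at anything else.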
01-29-2019 09:20 AM
01-29-2019 06:59 PM
Could you please send us the output of the command below from all the NodeManager hosts?
ls -alt /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
The correct permission should be like this:
---Sr-s--- 1 root yarn 53728 Jan 28 14:03 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
If it looks different, you can perform the following steps on all NodeManagers:
chgrp yarn /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
chmod 6050 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
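If you would rather script the check than compare the ls output by eye, here is a minimal sketch. It assumes GNU stat (as shipped with RHEL), and the function name check_perms is made up for this example. The expected owner, group and mode (root, yarn, 6050) are the ones given above.

```shell
#!/usr/bin/env bash
# check_perms FILE OWNER GROUP MODE
# Compare a file's owner, group and octal mode with the expected values.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be stat'ed.
check_perms() {
  local file=$1 want_owner=$2 want_group=$3 want_mode=$4
  local actual
  # GNU stat: %U = owner name, %G = group name, %a = octal mode (incl. setuid/setgid)
  actual=$(stat -c '%U %G %a' "$file") || return 2
  if [ "$actual" = "$want_owner $want_group $want_mode" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file is '$actual', expected '$want_owner $want_group $want_mode'"
    return 1
  fi
}

CE=/opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
if [ -e "$CE" ]; then
  check_perms "$CE" root yarn 6050
fi
```

Run it on each NodeManager host. A MISMATCH line shows the current owner, group and mode, so you can see which of the three is off.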
Thanks and hope this helps,
01-29-2019 08:36 PM
I changed the permissions, but that did not fix the problem.
[root@optim-rhel72-uppu bin]# ls -alt /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
---Sr-s--- 1 root yarn 53712 May 24 2018 /opt/cloudera/parcels/CDH/lib/hadoop-yarn/bin/container-executor
In the file below, could the problem have something to do with banned.users?
[root@optim-rhel72-uppu conf.cloudera.yarn]# cat /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
01-30-2019 10:24 AM
The banned.users property prevents jobs from being submitted under those user accounts. It should not prevent the NodeManager from starting.
I suggest checking these doc links:
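To see whether a particular account is on that list, something like the sketch below should work. It assumes banned.users is a single comma-separated line with no spaces around the commas. The function name is_banned and the account names in the usage line are only illustrative.

```shell
#!/usr/bin/env bash
# is_banned USER CFG: succeed (exit 0) if USER appears in the
# comma-separated banned.users list of CFG, fail (exit 1) otherwise.
is_banned() {
  local user=$1 cfg=$2 list
  # Take the first banned.users line and strip the "banned.users=" prefix.
  list=$(sed -n 's/^banned\.users=//p' "$cfg" | head -n 1)
  # Wrap both sides in commas so only whole names match, not substrings.
  case ",$list," in
    *",$user,"*) return 0 ;;
    *) return 1 ;;
  esac
}

CFG=/etc/hadoop/conf.cloudera.yarn/container-executor.cfg
if [ -r "$CFG" ]; then
  for u in yarn hdfs; do   # example account names, replace with your own
    if is_banned "$u" "$CFG"; then echo "$u is banned"; else echo "$u is not banned"; fi
  done
fi
```

This only tells you which accounts cannot submit jobs. It has no bearing on whether the NodeManager role itself starts.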
How many nodes does your cluster have? Have you checked all the permissions?