Support Questions

Find answers, ask questions, and share your expertise

Unable to start YARN - NodeManager on role NodeManager

Explorer

Execute command Start this NodeManager on role NodeManager 

Failed to start role

Supervisor returned FATAL. Please check the role log file, stderr, or stdout.

 

Environment details:

 

Version: Cloudera Express 5.15.0
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java VM Vendor: Oracle Corporation
Java Version: 1.7.0_67
 
System details:
Linux optim-rhel72-uppu.development.unicomglobal.software 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

 

I have followed the steps under "Configuring TLS/SSL for HDFS, YARN and MapReduce" at https://www.cloudera.com/documentation/enterprise/5-15-x/topics/sg_hive_encryption.html
 
Service did not start successfully; not all of the required roles started: only 0/1 roles started. Reasons : Service has only 0 NodeManager roles running instead of minimum required 1

I see below error in the role log:

 

Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:269)
 at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
 at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:199)
 at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:267)
 ... 3 more
Caused by: ExitCodeException exitCode=24: Invalid conf file provided : /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
 at org.apache.hadoop.util.Shell.run(Shell.java:507)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
 at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
 ... 4 more
 
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at optim-rhel72-uppu.development.unicomglobal.software/10.1.72.3
************************************************************/
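For reference, the exitCode=24 in the trace above is container-executor rejecting its configuration file ("Invalid conf file provided"). As I recall, the binary refuses to start unless container-executor.cfg is owned by root and not writable by group or others. A minimal sketch of that check, assuming the path from the role log; cfg_ok is a hypothetical helper, not part of Hadoop, and the real binary is stricter (it also validates ownership of every parent directory):

```shell
# Sketch: mimic container-executor's basic safety checks on its config file.
# cfg_ok is a hypothetical helper; the real binary also walks and checks
# every parent directory.
cfg_ok() {
  f="$1"
  [ -f "$f" ] || { echo "missing: $f"; return 1; }
  [ "$(stat -c '%U' "$f")" = "root" ] || { echo "not owned by root"; return 1; }
  # GNU find: -perm /022 matches files writable by group or others
  if [ -n "$(find "$f" -maxdepth 0 -perm /022)" ]; then
    echo "group/other writable"; return 1
  fi
  echo "ok"
}

# Usage against the path from the role log:
# cfg_ok /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
```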
 
Any help is highly appreciated.
 
Thanks,
Tulasi

1 ACCEPTED SOLUTION

Guru

Hi @Tulasi,

 

Sorry for my late reply. From the output you sent below:

 

[root@optim-rhel72-uppu ~]# id yarn
uid=1007(yarn) gid=1010(hadoop) groups=1010(hadoop)

 

This looks a little different than my test cluster. Can you please do this?

usermod -g yarn yarn
usermod -a -G hadoop yarn

 

Also, please paste the content of this file:

/opt/cloudera/parcels/CDH/meta/permissions.json
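For context on the two commands above: usermod -g yarn yarn resets yarn's primary group to yarn, and usermod -a -G hadoop yarn appends hadoop as a supplementary group. A small sketch to verify the result; has_groups is a hypothetical helper:

```shell
# Sketch: check that a user has the expected primary group and a given
# supplementary group. has_groups is a hypothetical helper.
has_groups() {
  user="$1"; want_primary="$2"; want_supp="$3"
  [ "$(id -gn "$user")" = "$want_primary" ] || return 1
  id -Gn "$user" | tr ' ' '\n' | grep -qx "$want_supp"
}

# After running the two usermod commands, this should succeed:
# has_groups yarn yarn hadoop && echo "groups look right"
```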

 

Thanks,

Li 

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum


17 REPLIES

Explorer

Hi Li,

 

This is what I am getting:

 

[root@optim-rhel72-uppu ~]# id yarn
uid=1007(yarn) gid=1010(hadoop) groups=1010(hadoop)

 

Thanks,

Tulasi

Expert Contributor

Is nosuid set on the mount point? I had a similar issue, documented here: http://community.cloudera.com/t5/Cloudera-Manager-Installation/URGENT-Cluster-unavailable-after-upgr...
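One way to check this at runtime (mount options can differ from what /etc/fstab says) is to read /proc/mounts for the filesystem containing a path. A sketch, assuming /opt/cloudera as the parcel location; mount_opts is a hypothetical helper:

```shell
# Sketch: print the mount options for the filesystem containing a path,
# using longest-prefix match over /proc/mounts. A "nosuid" option here
# would break setuid binaries such as container-executor.
mount_opts() {
  awk -v t="$1" '
    $2 == "/" || t == $2 || index(t, $2 "/") == 1 {
      if (length($2) >= best) { best = length($2); opts = $4 }
    }
    END { print opts }' /proc/mounts
}

# Example: flag a nosuid parcel filesystem (path is an assumption):
# mount_opts /opt/cloudera | tr ',' '\n' | grep -qx nosuid && echo "nosuid!"
```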

Explorer

This is the content of my /etc/fstab file

-----------------------------------

/dev/mapper/rhel_rhel72-root /                       xfs     defaults        0 0
UUID=d762b842-5c87-4e4d-bc0e-7a6bad357604 /boot                   xfs     defaults        0 0
/dev/mapper/rhel_rhel72-home /home                   xfs     defaults        0 0
/dev/mapper/rhel_rhel72-swap swap                    swap    defaults        0 0

 

 -----------------------------------

Do I need to change anything?

 

Thanks.


Explorer

Hi Li,

 

Thanks for staying on top of this and helping me solve the problem.

 

usermod -g yarn yarn

usermod -a -G hadoop yarn

 

Above two commands fixed my problem.

[root@optim-rhel72-uppu meta]# id yarn
uid=1007(yarn) gid=1008(yarn) groups=1008(yarn),1010(hadoop)

 

I have no idea how the yarn user's group memberships got changed; all I did was follow the Cloudera instructions for enabling encryption.

 

Thanks to all of the folks for providing suggestions.

 

Problems like this eat up a lot of time in figuring out where the fix is needed, and I would ask Cloudera to improve such situations.

 

Thanks,

Tulasi

 

 

Guru

Hi @Tulasi,

 

Great to hear the issue got resolved! I will report this internally to our documentation team to see how we can improve it.

 

Thanks,

Li

Li Wang, Technical Solution Manager



Expert Contributor

Hi @Tulasi,

 

Try killing any processes still running on the ports used by the YARN services, then try restarting.

 

Regards,

Manu.

Explorer
Hi Manu,

I tried your suggestion and it didn't work.

Thanks,
Tulasi