
Unable to start the node manager

Re: Unable to start the node manager

@Shelton Any update on this?

Re: Unable to start the node manager


@saivenkatg55 

Sorry for the delay, it was the festive period. Can you do the following?

Delete the rotated copies of /var/log/messages (the files named /var/log/messages.x) so that only /var/log/messages remains, then truncate that file so it contains only new entries:

# truncate --size 0 /var/log/messages

Do the same for the rotated NodeManager logs, /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-<node_name>.log.x, and then truncate /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-<node_name>.log:

# truncate --size 0 /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-<node_name>.log
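
The two cleanup steps above can be combined into one small helper. This is only a sketch: the function name reset_log is made up for illustration, and it assumes the rotated copies carry a numeric suffix (.1, .2, ...), which may not match the rotation scheme on your host. It deletes files, so check the glob before running it as root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove rotated copies of a log and empty the live file.
# Assumes rotated files are named <log>.<number>[...], e.g. messages.1
reset_log() {
  local log="$1"
  # Refuse to run on anything that is not an existing regular file.
  [ -f "$log" ] || return 1
  # Delete rotated copies; if nothing matches, rm -f silently does nothing.
  rm -f -- "$log".[0-9]*
  # Empty the live log so only new entries appear after the restart.
  truncate --size 0 "$log"
}

# Example calls (paths from this thread, <node_name> left for you to fill in):
# reset_log /var/log/messages
# reset_log "/var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-<node_name>.log"
```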


Start the NodeManager manually:

# su -l yarn -c "/usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh start nodemanager"


Then share the newly created files listed below:

  • /var/log/messages
  • /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-<node_name>.log
  • /var/lib/ambari-agent/data/errors-xxx.txt

Please report back.

Re: Unable to start the node manager

@Shelton Please check your mail and update me.

Re: Unable to start the node manager


@saivenkatg55 

 

In hadoop-yarn-nodemanager-w0lxdhdp05.ifc.org.log I see errors pointing to "Unable to start NodeManager: Could not load library. Reasons: [no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6279667856305652637.8 (Permission denied)]"

 

My suspicion:

 

Please verify that /tmp on the host does not have the noexec option set. You can check this by running /bin/mount and looking at the mount options. If you are able to, remount /tmp without noexec and try starting the NodeManager again. I am sure it is an issue with noexec on /tmp.

See my sample output:

[root@tokyo ~]# /bin/mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,size=7167976k,nr_inodes=1791994,mode=755)
.......
...
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=15609)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime)
/dev/sda1 on /boot type ext4 (rw,relatime,data=ordered)
/dev/sda5 on /opt type ext4 (rw,relatime,data=ordered)
/dev/sda8 on /home type ext4 (rw,relatime,data=ordered)
/dev/sda11 on /u02 type ext4 (rw,relatime,data=ordered)
/dev/sda6 on /var type ext4 (rw,relatime,data=ordered)
/dev/sda10 on /u01 type ext4 (rw,relatime,data=ordered)
/dev/sda9 on /tmp type ext4 (rw,relatime,data=ordered)
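
Instead of reading the full mount listing by eye, the same check can be scripted. This is a rough sketch, assuming findmnt from util-linux is available on the host; the function names has_noexec and check_dir are made up for illustration. The error message names a file under /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir, so it may be worth checking that directory as well as /tmp.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (return 0) if a comma-separated mount option
# string such as "rw,nosuid,noexec,relatime" contains the noexec option.
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *)          return 1 ;;
  esac
}

# Hypothetical helper: report whether the filesystem that holds a directory
# is mounted with noexec. Relies on `findmnt --target` to resolve the mount.
check_dir() {
  local dir="$1" opts
  opts="$(findmnt -no OPTIONS --target "$dir")" || return 2
  if has_noexec "$opts"; then
    echo "$dir: noexec is set ($opts)"
  else
    echo "$dir: noexec is not set ($opts)"
  fi
}

# Example calls (directories taken from this thread):
# check_dir /tmp
# check_dir /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
```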


This issue occurs when the user that starts the NodeManager process does not have the necessary rights and cannot create temporary files under the /tmp directory.

 

Solution

- Allow the user that starts the NodeManager read/write/execute access on /tmp
- Remove the noexec option when mounting /tmp
- Change the permissions on /tmp, e.g.: sudo chmod 1777 /tmp (world-writable with the sticky bit, the usual mode for /tmp)
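
Before changing anything, it can help to confirm whether the affected user can really create and execute a file in the directory. The following is a sketch only: the function name can_exec_in and the probe file name are made up for illustration, and it must be run as the user that starts the NodeManager (yarn in this thread) to say anything useful.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the current user can create a file in the
# given directory and execute it, non-zero otherwise (for example when the
# directory is not writable or the filesystem is mounted noexec).
can_exec_in() {
  local dir="$1" probe rc=0
  # Create a uniquely named probe file; fails if the directory is not writable.
  probe="$(mktemp "$dir/exec-probe.XXXXXX")" || return 1
  printf '#!/bin/sh\nexit 0\n' > "$probe"
  chmod 700 "$probe"
  # Executing the probe fails with "Permission denied" on a noexec mount.
  "$probe" || rc=$?
  rm -f -- "$probe"
  return "$rc"
}

# Example calls (directories taken from this thread):
# can_exec_in /tmp && echo "/tmp is usable"
# can_exec_in /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir && echo "tmpdir is usable"
```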

 

In /var/log/messages I can also see:
Jan 2 05:14:23 w0lxdhdp05 abrt-server: Package 'ambari-agent' isn't signed with proper key
Jan 2 05:14:23 w0lxdhdp05 abrt-server: 'post-create' on '/var/spool/abrt/Python-2020-01-02-05:14:22-11897' exited with 1
Jan 2 05:14:23 w0lxdhdp05 abrt-server: Deleting problem directory '/var/spool/abrt/Python-2020-01-02-05:14:22-11897'

 

Please edit /etc/abrt/abrt-action-save-package-data.conf and change the value of OpenGPGCheck from yes to no:

OpenGPGCheck = no

It might also be necessary to change the value of limit coredumpsize:

limit coredumpsize unlimited

After editing the file, restart the service with the following command:

# service abrtd restart
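
If you prefer not to edit the file by hand, the OpenGPGCheck change can be scripted. This is a minimal sketch assuming GNU sed; the function name set_opengpgcheck_no and the .bak backup copy are illustrative choices, not part of abrt. If the file has no OpenGPGCheck line, the function leaves it unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "OpenGPGCheck = no" in an abrt config file,
# keeping a backup of the original next to it as <file>.bak.
set_opengpgcheck_no() {
  local conf="$1"
  [ -f "$conf" ] || return 1
  cp -p -- "$conf" "$conf.bak"
  # Rewrite only the OpenGPGCheck line; every other line is left untouched.
  sed -i -E 's/^[[:space:]]*OpenGPGCheck[[:space:]]*=.*/OpenGPGCheck = no/' "$conf"
}

# Example call (path from this thread), followed by the restart shown above:
# set_opengpgcheck_no /etc/abrt/abrt-action-save-package-data.conf
# service abrtd restart
```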

Restart the NodeManager and share your joy!

Re: Unable to start the node manager

@Shelton As checked, /tmp does not have noexec enabled. Please suggest an alternative solution.

/dev/mapper/rootvg-tmp on /tmp type xfs (rw,relatime,attr2,inode64,noquota)

Re: Unable to start the node manager

@Shelton Any update on this? It looks like it is looking for some Java packages:

java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-4657625312215122883.8 (Permission denied)]

Can we install it externally?
