Unable to start Node Manager - java.lang.UnsatisfiedLinkError
Labels: Apache Ambari, Apache YARN
I have a problem with new nodes added to an HDP 3.0.1 cluster: the HDFS service is fine, but the NodeManager service fails to start with these errors:
/var/lib/ambari-agent/data/errors-26141.txt
resource_management.core.exceptions.ExecutionFailed: Execution of 'ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/3.0.1.0-187/hadoop/libexec && /usr/hdp/3.0.1.0-187/hadoop-yarn/bin/yarn --config /usr/hdp/3.0.1.0-187/hadoop/conf --daemon start nodemanager' returned 1.
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
Command line is not complete. Try option "help"
TERM environment variable not set.
ERROR: Cannot set priority of nodemanager process 34389
The TERM environment variable is set.
NodeManager.log
STARTUP_MSG: java = 1.8.0_112
************************************************************/
2020-03-04 15:18:35,735 INFO nodemanager.NodeManager (LogAdapter.java:info(51)) - registered UNIX signal handlers for [TERM, HUP, INT]
2020-03-04 15:18:36,133 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:openDatabase(1540)) - Using state database at /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery
2020-03-04 15:18:36,143 ERROR nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(936)) - Error starting NodeManager
java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, Permission denied]
at org.fusesource.hawtjni.runtime.Library.doLoad(Library.java:182)
at org.fusesource.hawtjni.runtime.Library.load(Library.java:140)
at org.fusesource.leveldbjni.JniDBFactory.<clinit>(JniDBFactory.java:48)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:1543)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1531)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:353)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:285)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:358)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
2020-03-04 15:18:36,149 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state STOPPED
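For context on the stack trace above: leveldbjni loads its native library through hawtjni, which, when the library is not found on java.library.path, extracts a bundled copy into java.io.tmpdir at startup. So the "Permission denied" reason usually points at that temp directory rather than at the JAR. A quick way to see whether a java.io.tmpdir override is configured (the conf path below is an assumption; adjust it to your layout):

```shell
# Look for a java.io.tmpdir override in the Hadoop/YARN config
# (Ambari commonly sets one in yarn-env.sh). The conf directory
# below is an assumption; adjust it to your installation.
conf_dir=/etc/hadoop/conf
grep -rs 'java.io.tmpdir' "$conf_dir" || echo "no java.io.tmpdir override found in $conf_dir"
```

The strace output later in this thread confirms this mechanism: the JVM tries to create `libleveldbjni-64-1-<random>.8` under the configured tmp directory.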
I have reviewed the execution permissions of the temporary directories.
- /tmp
- /var/tmp
- /var/lib/ambari-agent/tmp/
The java.library.path files were also copied from an old node.
The NodeManager service starts when run as the root user, but it does not start as the yarn user.
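Since the service starts as root but not as yarn, one thing worth ruling out is a parent directory that blocks traversal for the yarn user. A small sketch that walks every component of a path and prints its mode and owner (similar to `namei -l`); `/tmp` is used as a stand-in path here:

```shell
# Print mode and owner of every component of a path, so a parent
# directory missing execute (search) permission stands out.
walk_perms() {
  local IFS='/' p="" comp
  for comp in $1; do
    [ -n "$comp" ] || continue
    p="$p/$comp"
    stat -c '%A %U:%G %n' "$p"
  done
}
# /tmp is a stand-in; on a real node use something like:
# walk_perms /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
walk_perms /tmp
```

Run it as (or `sudo -u`) the yarn user against each of the temp directories listed above.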
Created on ‎03-06-2020 11:15 PM - edited ‎03-06-2020 11:15 PM
Hi san_t_o, thanks for adding more context.
When cleaning the indicated directories, are the libraries copied back automatically, or is it necessary to copy them manually?
--> /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/* and also on /var/lib/ambari-agent/tmp/
I was testing in my local cluster. Apologies, I meant to clear /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/ and sorry for the typo in my previous comment.
Even before clearing these directories or altering the location, it would be best to review with strace once. strace traces all system-level calls, and reviewing the last call before the failure could give us more clues. To install strace, run:
yum -y install strace
On the problematic node:
export HADOOP_LIBEXEC_DIR=/usr/hdp/3.0.1.0-187/hadoop/libexec
strace -f -s 2000 -o problematic_node /usr/hdp/3.0.1.0-187/hadoop-yarn/bin/yarn --debug --config /usr/hdp/3.0.1.0-187/hadoop/conf --daemon start nodemanager
On a good node:
export HADOOP_LIBEXEC_DIR=/usr/hdp/3.0.1.0-187/hadoop/libexec
strace -f -s 2000 -o good_node /usr/hdp/3.0.1.0-187/hadoop-yarn/bin/yarn --debug --config /usr/hdp/3.0.1.0-187/hadoop/conf --daemon start nodemanager
The files problematic_node and good_node will contain the traces; can you attach or paste them here?
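Once the traces exist, filtering for failed syscalls that touch the tmp directory narrows the large files down quickly. A sketch (the file names follow the `-o` arguments above, and the guard lets it run even if a trace is missing):

```shell
# Show failed syscalls on the tmp dir from each trace, if present.
for trace in problematic_node good_node; do
  if [ -f "$trace" ]; then
    echo "== $trace =="
    grep -E 'EACCES|ENOENT|EPERM' "$trace" | grep hadoop_java_io_tmpdir
  fi
done
```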
Created ‎03-09-2020 11:16 AM
Hi @venkatsambath,
I attach the strace output.
problematic_node
good_node
I found these lines:
Problematic Node
49213 stat("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8", 0x7f19c4ef1800) = -1 ENOENT (No such file or directory)
49213 open("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 EACCES (Permission denied)
Good Node:
19290 stat("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-280379409949290123.8", 0x7fbf276ef800) = -1 ENOENT (No such file or directory)
19290 open("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-280379409949290123.8", O_RDWR|O_CREAT|O_EXCL, 0666) = 354
The permissions of "hadoop_java_io_tmpdir" on the problematic node:
# stat hadoop_java_io_tmpdir/
File: ‘hadoop_java_io_tmpdir/’
Size: 8192 Blocks: 24 IO Block: 4096 directory
Device: fd02h/64770d Inode: 29362223 Links: 39
Access: (1777/drwxrwxrwt) Uid: ( 1073/ hdfs) Gid: ( 1051/ hadoop)
Access: 2020-03-09 12:29:40.811107572 -0500
Modify: 2020-03-09 12:29:38.659084671 -0500
Change: 2020-03-09 12:29:38.659084671 -0500
I'll be waiting for your comments.
Regards.
Created ‎03-09-2020 11:38 AM
Here you have the strace outputs:
https://drive.google.com/open?id=1aktK1QqOqY0ub5-qd35XSy5bUUj2hyIt
https://drive.google.com/open?id=1cebTRxbcjdkXFtdQ2adEDMBaJlZg3SPP
Regards.
Created ‎03-10-2020 07:57 PM
49213 open("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 EACCES (Permission denied)
During this step, the process tries to open this file and obtain a file descriptor, and access was denied:
/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
So far we have inspected its parent directories and haven't seen any issues with them. Can we also get details of this path and of the yarn user:
ls -ln /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
stat /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
id yarn
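Besides those checks, the failing open() can be reproduced outside the JVM: the JNI loader opens the file with O_RDWR|O_CREAT|O_EXCL, and the shell's noclobber mode (`set -C`) mimics the O_EXCL part. A sketch, with `/tmp` as a stand-in path; on the real node, run it as the yarn user (e.g. via `sudo -u yarn sh probe.sh`) against /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:

```shell
# Try to exclusively create a file in the suspect directory, the same
# operation the leveldbjni loader performs. /tmp is a stand-in path.
dir=/tmp
probe="$dir/leveldbjni-probe.$$"
if ( set -C; : > "$probe" ) 2>/dev/null; then  # set -C gives O_EXCL semantics
  echo "create OK in $dir"
  rm -f "$probe"
else
  echo "create DENIED in $dir"
fi
```

If this prints DENIED for yarn but OK for root, the problem is confirmed to be directory permissions rather than the library itself.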
Created ‎03-11-2020 10:50 AM
I think the same: we have analyzed its parent directories and apparently they are fine.
# ls -ln /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
ls: cannot access /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8: No such file or directory
# id yarn
uid=1075(yarn) gid=1051(hadoop) groups=1051(hadoop)
I have managed to start the NodeManager from the command line, but I noticed that the command executed from Ambari sets the owner of the "hadoop_java_io_tmpdir" path to "hdfs:hadoop". However, I cannot identify why the yarn user lacks write and execute permissions: the mode is 1777 and the yarn user is a member of the hadoop group.
Regards
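A closing note on the 1777 puzzle above: the mode bits of the directory itself are not the whole story. Access can still be denied by a POSIX ACL on the directory, by an SELinux label (when SELinux is enforcing), or by a parent component missing execute permission for the yarn user. These checks are a hedged sketch of how to rule the first two out (`/tmp` stands in for the Ambari tmp dir):

```shell
# Check for ACLs and SELinux state that could override plain
# rwx bits on the directory. /tmp is a stand-in path here.
dir=/tmp
ls -ld "$dir"
if command -v getfacl >/dev/null 2>&1; then
  getfacl -p "$dir"   # any entries beyond owner/group/other?
fi
if command -v getenforce >/dev/null 2>&1; then
  getenforce          # "Enforcing" SELinux can deny despite mode 1777
fi
```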
