YARN: One NodeManager refuses to start
Labels: Apache Hadoop, Apache YARN, Cloudera Manager
Created on ‎04-18-2014 06:43 AM - edited ‎09-16-2022 01:57 AM
Hello, I am currently having a problem with one NodeManager.
I performed a snapshot rollback on the cluster, and since then only this NodeManager (1 of 3) has had trouble starting.
The following errors appear:
chmod: changing permissions of `/var/run/cloudera-scm-agent/process/608-yarn-NODEMANAGER/container-executor.cfg': Operation not permitted
chmod: changing permissions of `/var/run/cloudera-scm-agent/process/608-yarn-NODEMANAGER/topology.map': Operation not permitted
So I tried to give these files the proper ownership (yarn:hadoop).
But then Cloudera Manager recreated the configuration under
/var/run/cloudera-scm-agent/process/609-yarn-NODEMANAGER/
I ended up with about ten YARN folders under the Cloudera agent process directory. I removed them, but the folder number kept incrementing each time I tried to start the role again.
So Cloudera Manager seems to create these two files with root:root ownership every time, which is weird since it shouldn't be able to do that.
I clearly don't understand what's going on here.
Any hints to help me resolve it?
Thanks 🙂
Lefevre Kevin
Created ‎04-18-2014 09:08 AM
Hi,
Those errors won't cause any actual problems with your runtime. They will appear on perfectly functional NodeManagers. We need to find the real error.
Is there some kind of fatal error at the end of the stderr log, or in the NodeManager role logs?
Thanks,
Darren
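A minimal sketch of the log check suggested above, assuming a Bash shell on the affected host. The function name `show_last_errors` and the match pattern are illustrative assumptions, not part of Cloudera Manager; point it at whichever stderr or role log file the role actually writes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last few error-looking lines of a log file.
# Usage: show_last_errors <logfile> [count]
show_last_errors() {
  local logfile="$1"
  local count="${2:-20}"   # default: show at most the last 20 matches

  if [ ! -r "$logfile" ]; then
    echo "cannot read $logfile" >&2
    return 1
  fi

  # -n prefixes each match with its line number in the log;
  # tail keeps only the most recent matches, which usually hold the real failure.
  grep -nE 'FATAL|ERROR|Exception' "$logfile" | tail -n "$count"
}
```

For example, `show_last_errors path/to/stderr.log 5` prints the last five matching lines together with their line numbers, which makes it easier to find the surrounding context in the full log.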
Created ‎11-10-2014 08:12 AM
What is the solution? We have the same issue when starting YARN.
Created ‎11-18-2014 07:44 PM
Supervisor returned FATAL. Please check the role log file, stderr, or stdout.
I have the same issue: when I try to start the NodeManager, it complains that the operation is not permitted:
chmod: changing permissions of `/var/run/cloudera-scm-agent/process/3669-yarn-NODEMANAGER/container-executor.cfg': Operation not permitted
chmod: changing permissions of `/var/run/cloudera-scm-agent/process/3669-yarn-NODEMANAGER/topology.map': Operation not permitted
+ exec /usr/lib/hadoop-yarn/bin/yarn nodemanager
Created ‎11-19-2014 12:06 PM
Have you changed any directory or file permissions in /var/run?
If yes, you should probably reconfigure YARN to use a NEW directory (for example, if YARN used /data/yarn/nm for the NodeManager, configure a new path such as /data/yarn/nm2). After we changed EVERY directory for YARN and restarted the cluster, YARN started, created the new directories, and set the permissions correctly, so now we don't have this kind of permission problem.
If you didn't change any permissions in the local file system, then I don't know what the issue is. Try another user: for example, run a Hive job as root, hdfs, yarn, or some other user to see whether the failure is user-related or happens every time.
T.
Created ‎03-18-2015 06:24 PM
What's the solution to this?
Created ‎11-23-2015 12:16 AM
Created ‎11-21-2014 10:04 AM
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <value>/var/lib/yarn-nm-recovery</value>
</property>
(Please create the /var/lib/yarn-nm-recovery directory and change its owner to the `yarn' user.)
And if you're not running YARN HA, then I'm at a loss. Could you paste your NM log from /var/log/hadoop-yarn/...?
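The directory preparation described in the parenthetical above could be scripted roughly as follows. This is only a sketch: `ensure_owned_dir` is a hypothetical helper name, and it assumes the `yarn` user already exists and that the command runs with enough privileges (typically root) to change ownership.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a directory if needed and hand it to one owner.
# Usage: ensure_owned_dir <directory> <owner>
ensure_owned_dir() {
  local dir="$1"
  local owner="$2"

  # mkdir -p succeeds even if the directory already exists.
  mkdir -p "$dir" || return 1

  # chown fails (non-zero exit) if the owner is unknown or privileges are missing.
  chown "$owner" "$dir"
}

# Intended call for this thread, run as root on the NodeManager host:
#   ensure_owned_dir /var/lib/yarn-nm-recovery yarn
```

Running `ls -ld /var/lib/yarn-nm-recovery` afterwards is a quick way to confirm that the owner column shows `yarn` before restarting the role.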
Created ‎12-05-2014 06:11 AM
I just learned that this has nothing to do with YARN HA, so you're likely running into an NM recovery issue.
If you upgrade to Cloudera Manager 5.2.1 (or later), it will automatically default the recovery dir to a non-tmp location, so you'll be fine. If you can't upgrade, you can manually set the config from the previous post.
