Created on 07-06-2014 08:34 AM - edited 09-16-2022 02:01 AM
Hello,
I am using the CDH 5.0 QuickStart VM for VirtualBox. I run a mapreduce job and it completes successfully. The reduce code has several System.out.println statements which should appear in the job's logs. But when I open the logs in Hue > Job Browser, I get the error message:
Error getting logs for job_1404657916663_0001
When I check the JobHistory server log at /var/log/hadoop-mapreduce/hadoop-cmf-yarn-JOBHISTORY-localhost.localdomain.log.out, I find exceptions like the one below:
2014-07-06 08:18:20,767 ERROR org.apache.hadoop.yarn.webapp.View: Error getting logs for job_1404659730687_0001
org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=EXECUTE, inode="/tmp/logs/cloudera/logs":cloudera:supergroup:drwxrwx---
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:168)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5461)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5443)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:5405)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1680)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1632)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1612)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1586)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:482)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
So, I do:
hdfs dfs -chmod -R 777 /tmp/logs/cloudera
drwxrwxrwx - cloudera supergroup 0 2014-07-06 08:23 /tmp/logs/cloudera/logs
Then I run the mapreduce job again. I still get the same error in the Job Browser.
The permissions on the log directory have reverted:
drwxrwx--- - cloudera supergroup 0 2014-07-06 08:23 /tmp/logs/cloudera/logs
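(Aside: the mapred user needs EXECUTE permission on every ancestor of the inode named in the exception, so it is worth inspecting each level of the path. A quick sketch, using the paths from the trace above:)
```
# Show mode and ownership of each directory the JobHistory server must traverse
sudo -u hdfs hdfs dfs -ls -d /tmp/logs /tmp/logs/cloudera /tmp/logs/cloudera/logs
```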
What should be done to get the logs for the job?
Thanks in advance.
Lohith.
Created 07-09-2014 04:20 AM
Hi Romain,
Thanks for the reply.
I did:
sudo -u hdfs hadoop fs -chown -R mapred:mapred /tmp/logs
When I ran jobs as 'cloudera', no new logs were written to that directory. In Hue, I got a message that no logs were available and that aggregation may not have completed.
Then I reverted as follows:
sudo -u hdfs hadoop fs -chown -R cloudera:supergroup /tmp/logs
Then, when the jobs ran, log directories were created in /tmp/logs/cloudera/logs. However, in Hue, I got the original error message.
The new directories were created as below:
drwxrwx--- - cloudera supergroup 0 2014-07-09 04:03 /tmp/logs/cloudera/logs/application_1404900446410_0005
-rw-r----- 1 cloudera supergroup 3755405 2014-07-09 04:03 /tmp/logs/cloudera/logs/application_1404900446410_0005/localhost.localdomain_8041
These permissions are not correct: the mapred user is not in the supergroup group, so it cannot read the files.
In the files under /tmp/logs, I can see the log statements from my reduce program, but they are not available in Hue.
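(For reference, the aggregated logs can also be read straight from the command line, bypassing Hue entirely; the application id below is the one from the listing above:)
```
# Print the aggregated container logs for the application from HDFS
yarn logs -applicationId application_1404900446410_0005
```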
What should I do?
Thanks in advance!
Created 04-15-2015 12:27 AM
Hello,
I'm having the same problem.
I tried:
sudo -u hdfs hadoop fs -ls -R /tmp
but this didn't work.
The log files are written under /tmp, but Hue is not finding them.
Created 05-11-2015 06:33 AM
Greetings,
The proper permissions are:
/tmp/logs should be mode 1777, owned by mapred:hadoop. The leading 1 is the sticky bit, which stops users from deleting or moving each other's aggregated logs; HDFS directories always inherit the group of their parent, so new per-user subdirectories will pick up the hadoop group.
sudo -u hdfs hdfs dfs -chmod 1777 /tmp/logs
sudo -u hdfs hdfs dfs -chown mapred:hadoop /tmp/logs
sudo -u hdfs hdfs dfs -chgrp -R hadoop /tmp/logs
After this I would restart your JobHistory server.
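(A sketch of verifying the result and restarting; the service name below is how the plain CDH 5 packages install the JobHistory server, so restart it through Cloudera Manager instead if CM manages your cluster:)
```
# Expected after the commands above: drwxrwxrwt, owner mapred, group hadoop
sudo -u hdfs hdfs dfs -ls -d /tmp/logs
# Restart the JobHistory server so it runs against the corrected directory
sudo service hadoop-mapreduce-historyserver restart
```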
Created on 08-10-2016 03:40 PM - edited 08-10-2016 03:42 PM
Thank you mageru9, this worked!
Created 03-15-2017 06:24 PM
We were unable to redirect completed Spark jobs from YARN to the Spark History Server even though all permissions and the Spark conf were set correctly.
Might be useful.
The issue was that we were passing a custom properties file when submitting the Spark job, expecting our settings to be merged with the defaults from spark-defaults.conf.
It turns out a custom file replaces the default config file entirely; even a blank file means spark-defaults.conf is ignored for that job.
We had to add the three lines below to the custom file to enable event logging for the Spark History Server and to make the ResourceManager's history URL point to it.
This has to be done for every Spark job; if a job is submitted without these three parameters, it will not show up in the Spark History Server no matter what you restart.
```
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://nameservice1/user/spark/applicationHistory
spark.yarn.historyServer.address=http://sparkhist-dev.visibleworld.com:18088
```
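(For anyone hitting the same thing, this is roughly how the custom file gets passed; the file path, class, and jar names below are placeholders:)
```
# --properties-file REPLACES spark-defaults.conf rather than merging with it,
# so the three settings above must also be present in the file passed here.
spark-submit --master yarn \
  --properties-file /path/to/custom-spark.conf \
  --class com.example.MyApp my-app.jar
```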
https://community.cloudera.com/t5/CDH-Manual-Installation/Permission-denied-user-mapred-access-WRITE...
Created 07-24-2018 01:29 PM
We were unable to access Spark application log files from either the YARN or the Spark History Server UIs, with the error "Error getting logs at <worker_node>:8041", although we could see the logs with the "yarn logs" command. It turned out our yarn.nodemanager.remote-app-log-dir was /tmp/logs, but the directory was owned by yarn:yarn. Following your instructions fixed the issue.
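(A quick way to check the same two things; the XML path below is the stock CDH location, and the default of /tmp/logs applies if the property is not set there:)
```
# Show which HDFS directory NodeManagers aggregate logs into
grep -A1 'yarn.nodemanager.remote-app-log-dir' /etc/hadoop/conf/yarn-site.xml
# Then check who owns it (should be mapred:hadoop per the fix above)
hdfs dfs -ls -d /tmp/logs
```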
Thanks a lot!
Miles
Created 09-03-2019 08:39 AM
Hi,
For me it happens when fetching YARN logs with the yarn logs command. The /tmp/logs directory has the permissions below:
drwxrwxrwt - yarn yarn
But the directories inside it are owned by whoever submitted the application:
drwxrwxrwt - <owner_of_directory> yarn
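(Per-user directories owned by the submitting user are normal there. If the failure is reading another user's logs, yarn logs can be told whose logs to fetch; the application id and owner below are placeholders:)
```
# Read aggregated logs for an application submitted by a different user
yarn logs -applicationId application_<id> -appOwner <owner_of_directory>
```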
Your guidance will be helpful.
Thanks,