YARN config to skip staging dir ownership check

Contributor

When a YARN/MR job is submitted, it checks the ownership of the staging directory, and if the owner does not match the user who is submitting the job, it throws the exception below.

The staging directory path is taken from the YARN config property yarn.app.mapreduce.am.staging-dir, which is set to /tmp/hadoop-yarn/staging here.

java.io.IOException: The ownership on the staging directory /tmp/hadoop-yarn/staging/hdfs/.staging is not as expected. It is owned by . The directory must be owned by the submitter hdfs or hdfs
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:152)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:151)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)

Is there a YARN config that skips the ownership check for the staging directory?

I am facing this issue with OzoneFS, not with HDFS.

The ownership check happens in the file below: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-...

Any workaround to bypass or skip the check?

4 REPLIES

Expert Contributor
@Soumitra Sulav

The problem seems to be that the FileStatus returned by OzoneFileSystem does not have the owner field set, so the owner is empty. As a result, the ownership check fails.

One workaround I see is to delete the /tmp/hadoop-yarn/staging/hdfs/.staging directory before submitting the MapReduce job. The ownership check is then bypassed, and the staging directory is created again.

But this means that you can't have more than one job using the /tmp/hadoop-yarn/staging/hdfs/.staging directory. So it's not a good workaround, although it is the only one available from what I see (apart from a code change in MapReduce/Ozone).
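To make the reasoning above concrete, here is a minimal in-memory sketch of the behavior as described in this thread. It is not Hadoop code: the class StagingDirCheckSketch, its methods, and the reportsOwner flag are hypothetical, and the comparison rule (owner must equal the submitting user or the real user, ignoring case) is an assumption based on the exception message. It only illustrates why an empty owner fails the check on an existing directory and why deleting the directory first avoids the check.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the staging-directory ownership check. */
public class StagingDirCheckSketch {

    // path -> owner; stands in for the owner that a file status would report
    private final Map<String, String> owners = new HashMap<>();

    // false models a file system that reports an empty owner for directories
    private final boolean reportsOwner;

    public StagingDirCheckSketch(boolean reportsOwner) {
        this.reportsOwner = reportsOwner;
    }

    /** Assumed rule: the owner must equal the submitter or the real user. */
    public static boolean ownerMatches(String fileOwner, String currentUser, String realUser) {
        return fileOwner.equalsIgnoreCase(currentUser)
                || fileOwner.equalsIgnoreCase(realUser);
    }

    /** Checks the owner if the directory exists; otherwise creates it unchecked. */
    public String getStagingDir(String path, String currentUser, String realUser)
            throws IOException {
        String fileOwner = owners.get(path);
        if (fileOwner != null) {
            // Existing directory: the ownership check applies.
            if (!ownerMatches(fileOwner, currentUser, realUser)) {
                throw new IOException("The ownership on the staging directory " + path
                        + " is not as expected. It is owned by " + fileOwner
                        + ". The directory must be owned by the submitter "
                        + currentUser + " or " + realUser);
            }
        } else {
            // Missing directory: it is created and no ownership check runs.
            owners.put(path, reportsOwner ? currentUser : "");
        }
        return path;
    }

    /** Models deleting the staging directory before a submission. */
    public void delete(String path) {
        owners.remove(path);
    }

    /** Convenience wrapper: true if the simulated submission passes the check. */
    public boolean submitSucceeds(String path, String currentUser, String realUser) {
        try {
            getStagingDir(path, currentUser, realUser);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String dir = "/tmp/hadoop-yarn/staging/hdfs/.staging";
        StagingDirCheckSketch fs = new StagingDirCheckSketch(false);

        System.out.println(fs.submitSucceeds(dir, "hdfs", "hdfs")); // true: directory created
        System.out.println(fs.submitSucceeds(dir, "hdfs", "hdfs")); // false: owner is empty
        fs.delete(dir);
        System.out.println(fs.submitSucceeds(dir, "hdfs", "hdfs")); // true: created again
    }
}
```

In this model, constructing the sketch with reportsOwner set to true makes every submission pass, which corresponds to the case where the file system reports the directory owner correctly.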

Contributor

This workaround means that I will have to delete the staging directory before submitting every new job, and that a user will be able to run only one job at a time.

Expert Contributor

Yes. With this workaround, a user can run only a single job at any moment. To run multiple jobs at the same time, they all need to be submitted as different users.

New Contributor

You have to change the owner of the staging directory:

hadoop_amine@amine:/home/amine$ hadoop fs -chown -R hadoop_amine:hadoop_group /tmp/hadoop-yarn/staging/hadoop_amine/