YARN config to skip staging dir ownership check

Cloudera Employee

When a YARN/MR job is submitted, the client checks the ownership of the staging directory. If the owner doesn't match the user who is submitting the job, it throws the exception below.

The staging directory path is taken from the YARN config [yarn.app.mapreduce.am.staging-dir = /tmp/hadoop-yarn/staging].

java.io.IOException: The ownership on the staging directory /tmp/hadoop-yarn/staging/hdfs/.staging is not as expected. It is owned by . The directory must be owned by the submitter hdfs or hdfs
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:152)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:151)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
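
For reference, the path in the exception is consistent with the configured base directory followed by the submitting user and a .staging suffix. Below is a minimal sketch of that layout. The class and the stagingDirFor helper are hypothetical names used only for illustration, and the layout is an assumption inferred from the path in the error message, not the actual Hadoop implementation.

```java
public class StagingPathSketch {
    // Base directory as quoted in the post for yarn.app.mapreduce.am.staging-dir.
    static final String DEFAULT_STAGING_DIR = "/tmp/hadoop-yarn/staging";

    // Hypothetical helper: joins the base dir, the submitting user, and ".staging".
    // Assumption: this layout is inferred from the path in the exception message.
    static String stagingDirFor(String baseDir, String user) {
        String base = baseDir.endsWith("/")
                ? baseDir.substring(0, baseDir.length() - 1)
                : baseDir;
        return base + "/" + user + "/.staging";
    }

    public static void main(String[] args) {
        // prints "/tmp/hadoop-yarn/staging/hdfs/.staging"
        System.out.println(stagingDirFor(DEFAULT_STAGING_DIR, "hdfs"));
    }
}
```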

Is there a YARN config that skips the ownership check for the staging directory?

I am facing this issue with OzoneFS, not with HDFS.

The ownership check happens in the file below: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-...

Any workaround to bypass or skip the check?

3 REPLIES

Re: YARN config to skip staging dir ownership check

Rising Star
@Soumitra Sulav

The problem seems to be that the FileStatus returned by OzoneFileSystem does not have the owner field set, so it is empty. As a result, the ownership check fails.
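
To illustrate why an empty owner fails, here is a minimal sketch of the comparison. It assumes the check is a plain string match of the directory owner against the two user names printed in the exception message. The class and method names are hypothetical, and this is not the Hadoop source.

```java
public class OwnerCheckSketch {
    // Assumption: the directory is accepted only if its owner equals one of
    // the two submitter names shown in the exception ("hdfs or hdfs").
    static boolean ownerAccepted(String dirOwner, String currentUser, String realUser) {
        return dirOwner.equals(currentUser) || dirOwner.equals(realUser);
    }

    public static void main(String[] args) {
        // Owner reported correctly: the check passes.
        System.out.println(ownerAccepted("hdfs", "hdfs", "hdfs")); // prints "true"
        // Owner field left empty, as described above: the check fails.
        System.out.println(ownerAccepted("", "hdfs", "hdfs"));     // prints "false"
    }
}
```

This would also explain the "It is owned by ." fragment in the exception, where the owner name is blank.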

One workaround I see is to delete the /tmp/hadoop-yarn/staging/hdfs/.staging directory before submitting the MapReduce job. The ownership check is then bypassed, and the staging directory is created again.

But this means that you can't have more than one job using the /tmp/hadoop-yarn/staging/hdfs/.staging directory at a time. So it's not a good workaround, although it is the only one available from what I see (apart from a code change in MapReduce/Ozone).
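
The effect of the workaround can be sketched with an in-memory stand-in for the file system. This is an illustration under stated assumptions, not the Hadoop FileSystem API. It assumes the ownership check runs only when the staging directory already exists, and that the reported owner is always empty, as described earlier in this thread. All names here are hypothetical.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class StagingWorkaroundSketch {
    static final String STAGING = "/tmp/hadoop-yarn/staging/hdfs/.staging";

    // In-memory stand-in for the set of existing directories.
    static final Set<String> dirs = new HashSet<>();

    // Assumption from the thread: the owner field is never set, so it is empty.
    static String reportedOwner(String path) {
        return "";
    }

    // Check ownership if the directory exists; otherwise create it without a check.
    static void prepareStagingDir(String path, String user) throws IOException {
        if (dirs.contains(path)) {
            String owner = reportedOwner(path);
            if (!owner.equals(user)) {
                throw new IOException("The ownership on the staging directory "
                        + path + " is not as expected. It is owned by " + owner + ".");
            }
        } else {
            dirs.add(path);
        }
    }

    // Returns true if the simulated submission gets past the staging step.
    static boolean trySubmit(String path, String user) {
        try {
            prepareStagingDir(path, user);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(trySubmit(STAGING, "hdfs")); // directory missing, so it is created
        System.out.println(trySubmit(STAGING, "hdfs")); // directory exists, empty owner fails the check
        dirs.remove(STAGING);                           // the workaround: delete the directory
        System.out.println(trySubmit(STAGING, "hdfs")); // directory missing again, so it is created
    }
}
```

In this model, the delete has to happen before every submission, and deleting the directory while another job from the same user is still using it would remove that job's staging files.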

Re: YARN config to skip staging dir ownership check

Cloudera Employee

This workaround means that I will have to delete the staging dir before submitting every new job, and also that only a single user will be able to run a job at any given moment.

Re: YARN config to skip staging dir ownership check

Rising Star

Yes. Actually, a user can run only a single job at any moment. To run multiple jobs at the same time, they all need to be submitted as different users.