Files created after running oozie shell action are owned by yarn user

Super Guru

Hi,

I am running a simple shell action from Hue (logged in to Hue as the hdfs user):

$ cat test.sh
echo "hello" > /tmp/test

The workflow executes successfully. When I check the file's permissions and ownership:

$ ls -al /tmp/test
-rw-r--r-- 1 yarn hadoop 6 2016-05-25 14:43 /tmp/test

The above output shows that the file created by the shell action is owned by yarn.

How can I make the Oozie shell action create files owned by the user who runs the shell action/workflow (in this case, hdfs)?

I am expecting output like this:

-rw-r--r-- 1 hdfs hadoop 6 2016-05-25 14:43 /tmp/test
1 ACCEPTED SOLUTION

Expert Contributor
@Sagar Shimpi

By default, shell actions cannot run as another user because sudo is blocked. If you want a YARN application to run as someone other than yarn (that is, as the submitter), you need to enable the Linux container executor so that containers are started as the submitting user. Note that the setting below also needs to be changed to achieve this.

With yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=true (default), containers run as yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user (default is 'nobody').

With yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=false, containers run as the user submitting the workflow.
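
As a rough illustration, the settings discussed here might look like the yarn-site.xml fragment below. This is a sketch under assumptions, not a tested configuration: the group value 'hadoop' is assumed from the ls output in the question, and the fragment follows the Hadoop documentation's description, in which limit-users=false is the value that launches containers as the submitting user. The container-executor binary also has to be set up as the Hadoop documentation describes, the submitting user has to exist as an OS account on each NodeManager host, and the NodeManagers need a restart afterwards.

```xml
<!-- yarn-site.xml on every NodeManager (sketch; values are assumptions) -->
<configuration>
  <!-- Use the Linux container executor instead of the default executor -->
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <!-- Group of the container-executor binary; 'hadoop' is assumed here -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.group</name>
    <value>hadoop</value>
  </property>
  <!-- false: launch containers as the submitting user, not as local-user -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
    <value>false</value>
  </property>
</configuration>
```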

Note that there are known issues in this area where it does not work as expected; see https://issues.apache.org/jira/browse/YARN-2424 and https://issues.apache.org/jira/browse/YARN-3462

My current suggestion is to add a line to the shell script that changes the ownership of the file it creates.


8 REPLIES

Master Guru

@Sagar Shimpi

You may need to enable proxyuser.

User ProxyUser Configuration

Oozie supports impersonation or proxyuser functionality (identical to Hadoop proxyuser capabilities and conceptually similar to Unix 'sudo').

Proxyuser enables other systems that are Oozie clients to submit jobs on behalf of other users.

Because proxyuser is a powerful capability, Oozie provides the following restriction capabilities (similar to Hadoop):

  • Proxyuser is an explicit configuration on a per-proxyuser basis.
  • A proxyuser user can be restricted to impersonate other users from a set of hosts.
  • A proxyuser user can be restricted to impersonate users belonging to a set of groups.

There are 2 configuration properties needed to set up a proxyuser:

  • oozie.service.ProxyUserService.proxyuser.#USER#.hosts: hosts from where the user #USER# can impersonate other users.
  • oozie.service.ProxyUserService.proxyuser.#USER#.groups: groups the users being impersonated by user #USER# must belong to.

Both properties support the '*' wildcard as a value, although this is recommended only for testing/development.
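
For illustration, the two properties might be filled in as in the oozie-site.xml fragment below. It is only a sketch: it assumes the Oozie client doing the impersonation is Hue running as a service account named 'hue', and both the host name hue-host.example.com and the group 'hadoop' are hypothetical placeholders. #USER# is the account of the client system that submits on behalf of others, not the end user being impersonated.

```xml
<!-- oozie-site.xml (sketch; 'hue', the host name and the group are assumptions) -->
<configuration>
  <!-- Hosts from which the 'hue' account may impersonate other users -->
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
    <value>hue-host.example.com</value>
  </property>
  <!-- Groups the impersonated users must belong to -->
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
    <value>hadoop</value>
  </property>
</configuration>
```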

Super Guru

@Sunile Manjee

I tried setting the property in oozie-site.xml with #USER# as hdfs, but it still did not work.

Master Guru

I assume you restarted Oozie?

Super Guru

Yes. I restarted Oozie after making the modifications.

Rising Star

This is a known limitation in non-secure clusters: containers run as the yarn user rather than as the logged-in user. Try setting this:

<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
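
To show where that element sits, here is a minimal workflow.xml sketch. The workflow name, node names, and the ${jobTracker} and ${nameNode} parameters are hypothetical, and test.sh is assumed to be in the workflow application directory. Note that HADOOP_USER_NAME only changes the identity that Hadoop client calls made by the script (for example hdfs dfs commands) present on a non-secure cluster; it does not change the OS user the container process runs as, so a file written with a plain shell redirect would most likely still be owned by the container user.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-owner-test">
  <start to="shell-node"/>
  <action name="shell-node">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>test.sh</exec>
      <!-- Read by Hadoop client code under simple (non-Kerberos) authentication -->
      <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
      <!-- Ship test.sh from the workflow application directory -->
      <file>test.sh#test.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Shell action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```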

Super Guru

@ibhatt

I already tried this, but it did not work for me.

Super Guru

Thanks for the info.