Support Questions

Find answers, ask questions, and share your expertise

Oozie shell action - scp with user

avatar
Expert Contributor

In an Oozie shell action I have written code to copy files from a remote server using the scp command, and I have enabled passwordless access to the remote server for the user who submits the workflow.

 

Below is the workflow.xml:

 

 

<workflow-app name="oracle_log" xmlns="uri:oozie:workflow:0.4">
<start to="scp_copy"/>
<action name="scp_copy">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>bin/scp-hadoop.sh</exec>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>bin/scp-hadoop.sh#scp-hadoop.sh</file>
</shell>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

scp-hadoop.sh

 

hadoop fs -mkdir /user/{user_name}/dd
scp -r user@xxx:/data/input/ddd.txt /home/{user_name}/

First line in scp-hadoop.sh --> it successfully creates the directory as the same user who submitted the workflow.

 

But the second line always connects to the remote machine as the yarn user.

 

Why is it communicating with the remote machine as the yarn user instead of the user that I pass in

 

HADOOP_USER_NAME=${wf:user()}

 

Please kindly help me.

 

 

 

 

1 ACCEPTED SOLUTION

avatar
Mentor
In unsecured mode, all YARN container processes execute as the local Linux user "yarn". This cannot be changed unless you either enable Kerberos-based security or explicitly turn on the LinuxContainerExecutor [1], which also requires ensuring that local Linux accounts exist for all job-submitting users.

The HADOOP_USER_NAME value affects only 'hadoop' and other related Apache Hadoop ecosystem commands. Since the 'scp' program is not a Hadoop program, it is not influenced by the username carried in that variable. It instead runs as the Linux user that runs the shell script, which is "yarn" for the reason above.
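To illustrate the distinction, a minimal sketch of what happens inside the container (the username "alice" is a hypothetical workflow submitter, not from the thread):

```shell
#!/bin/sh
# Sketch: HADOOP_USER_NAME changes only the identity that Hadoop clients
# report on an unsecured cluster, never the Linux process user.
export HADOOP_USER_NAME=alice

echo "Linux user:  $(id -un)"            # inside an unsecured container: "yarn"
echo "Hadoop user: $HADOOP_USER_NAME"    # Hadoop commands act as "alice"

# 'hadoop fs' honors the variable; 'scp' authenticates as $(id -un):
#   hadoop fs -mkdir "/user/$HADOOP_USER_NAME/dd"   # created as alice in HDFS
#   scp user@remote:/data/input/ddd.txt /tmp/       # connects as yarn over SSH
```

This is why the first line of the poster's script works as expected while the second does not: the two commands resolve "who am I" through entirely different mechanisms.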

[1] - https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_other_hadoop_security.html#to... and 'Always Use Linux Container Executor' under CM -> YARN -> Configuration


3 REPLIES


avatar
New Contributor

Hi Harsh,

 

I tried changing the property yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users to false under the YARN Client Advanced Configuration Snippet (Safety Valve) for yarn-site.xml in Cloudera Manager, and I restarted the service. Despite that, I am still getting the permission error.

 

I had also changed the setting yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user to the concerned user other than yarn. I am still getting the error.
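For reference, a sketch of what the non-secure LinuxContainerExecutor setup would look like in yarn-site.xml. The two nonsecure-mode property names come from the posts above; the executor class is the standard Hadoop one, and the group value "hadoop" is an assumption for your cluster:

```xml
<!-- yarn-site.xml sketch: run containers as the submitting user (non-secure) -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value> <!-- assumed group; must match container-executor.cfg -->
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>
```

Note that the LinuxContainerExecutor also depends on the setuid container-executor binary and its container-executor.cfg being configured correctly on every NodeManager host, which is a common source of the permission errors described here.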

 

Any workarounds?

 

Thanks,

Mahesh.

avatar
Expert Contributor
Have you found any way to run the YARN container as the user who launched it? I have also set these two properties and have all the nodes synced with LDAP, but it still runs as nobody, despite the fact that I can see it says the yarn user request is ...