Created 02-11-2016 03:05 AM
When a user submits a job to YARN via Spark or Samza, the job gets executed as the "yarn" user. How can we make sure the job runs as the same user who submitted it?
Please advise.
Created 02-11-2016 06:53 PM
I believe this needs to be handled with Kerberos; if the cluster is not kerberized, the job will run as yarn.
Created 02-15-2016 09:35 AM
I believe we can do something like this:
For example, if you are running the Spark shell, you can add the configurations below to core-site.xml and then run your job with --proxy-user <username>. (Strictly speaking, the user named in the hadoop.proxyuser.*.hosts/groups properties is the real user who is allowed to impersonate others, while --proxy-user names the user to impersonate.)
<property>
  <name>hadoop.proxyuser.<username>.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.<username>.groups</name>
  <value>*</value>
</property>

Command to run the Spark shell on YARN with a proxy user:
spark-shell --master yarn-client --proxy-user <username>
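
One detail worth noting: changes to the proxyuser properties in core-site.xml are usually not picked up until the NameNode and ResourceManager reload them. A minimal sketch of the refresh and a sanity check, assuming you have admin rights on the cluster and <username> is the placeholder from the config above:

# Reload the superuser/proxyuser mappings without a full restart
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration

# Sanity check: files written from the proxied shell should be owned
# by <username>, not by yarn
hdfs dfs -ls /user/<username>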
Created 05-16-2016 09:50 AM
It didn't work for me. I am getting the exception below:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): <proxyuser> tries to renew a token with renewer <loggeduser>
Created 11-11-2016 09:02 AM
Is it possible to follow the above approach in a Kerberos environment? I tried the above steps to run a job as a proxy user, but it failed with a GSS initialization exception. Any pointers?
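
In a kerberized cluster the proxy-user mechanism can still be used, but the real user who launches the shell must hold a valid Kerberos ticket; a GSS initialization error usually means no ticket was found. A minimal sketch, where the principal and keytab path are placeholders for your environment:

# The submitting (real) user authenticates first; impersonation is
# layered on top of that Kerberos identity
kinit -kt /etc/security/keytabs/superuser.keytab superuser@EXAMPLE.COM

# klist confirms the ticket is in place before submitting
klist

# The real user (here "superuser") must be the one named in the
# hadoop.proxyuser.superuser.* properties in core-site.xml
spark-shell --master yarn-client --proxy-user <username>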
Created 02-16-2016 05:33 PM
Note that even when running as OS user "yarn", an environment variable, HADOOP_USER_NAME, passes the name of the account that submitted the work into that process, and it is picked up by the HDFS client. The code should therefore be able to work with HDFS directories as the submitter, with the submitter's permissions. That is, as you may have guessed, completely insecure and open to abuse; to close that hole you need to make the leap to Kerberos, I'm afraid.
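
To illustrate just how trivially that identity can be asserted when Kerberos is off (simple auth), here is a small sketch; the user name "alice" is only an example:

# Any shell user can claim any HDFS identity by setting one variable;
# with simple auth the HDFS client trusts the name without verifying it
HADOOP_USER_NAME=alice hdfs dfs -ls /user/alice
HADOOP_USER_NAME=alice hdfs dfs -put report.csv /user/alice/

This is exactly why simple auth offers no real security: the identity is asserted by the client and never checked.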