Created 02-11-2016 03:05 AM
When a user submits a job to YARN via Spark or Samza, the job gets executed as the "yarn" user. How can we make sure the job runs as the same user who submitted it?
Please advise.
Created 02-11-2016 06:53 PM
I believe this needs to be handled with Kerberos; if the cluster is not kerberized, the job will run as yarn.
Created 02-15-2016 09:35 AM
I believe we can do something like this:
For example, if you are running the Spark shell, you can add the configurations below to core-site.xml and then run your job with --proxy-user <username>. (Strictly speaking, the user named in the hadoop.proxyuser.*.hosts/groups properties is the real user who is allowed to impersonate others, while --proxy-user names the user to impersonate.)
<property>
  <name>hadoop.proxyuser.<username>.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.<username>.groups</name>
  <value>*</value>
</property>

Command to run the Spark shell on YARN with a proxy user:
spark-shell --master yarn-client --proxy-user <username>
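
One detail worth noting: changes to the proxyuser properties in core-site.xml are usually not picked up until the NameNode and ResourceManager reload them. A minimal sketch of the refresh and a sanity check, assuming you have admin rights on the cluster and <username> is the placeholder from the config above:

# Reload the superuser/proxyuser mappings without a full restart
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration

# Sanity check: files written from the proxied shell should be owned
# by <username>, not by yarn
hdfs dfs -ls /user/<username>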
Created 05-16-2016 09:50 AM
It didn't work for me. I am getting the exception below:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): <proxyuser> tries to renew a token with renewer <loggeduser>
Created 11-11-2016 09:02 AM
Is it possible to follow the above approach in a Kerberos environment? I tried the above steps to run a job as a proxy user, but it failed with a GSS initialization exception. Any pointers?
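
In a kerberized cluster the proxy-user mechanism can still be used, but the real user who launches the shell must hold a valid Kerberos ticket; a GSS initialization error usually means no ticket was found. A minimal sketch, where the principal and keytab path are placeholders for your environment:

# The submitting (real) user authenticates first; impersonation is
# layered on top of that Kerberos identity
kinit -kt /etc/security/keytabs/superuser.keytab superuser@EXAMPLE.COM

# klist confirms the ticket is in place before submitting
klist

# The real user (here "superuser") must be the one named in the
# hadoop.proxyuser.superuser.* properties in core-site.xml
spark-shell --master yarn-client --proxy-user <username>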
Created 02-16-2016 05:33 PM
Note that even when running as OS user "yarn", an environment variable, HADOOP_USER_NAME, passes the name of the account that submitted the work into that process, and it is picked up by the HDFS client. The code should therefore be able to work with HDFS directories as the submitter, with the submitter's permissions. That is, as you may have guessed, completely insecure and open to abuse; to close that hole you need to make the leap to Kerberos, I'm afraid.
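
To illustrate just how trivially that identity can be asserted when Kerberos is off (simple auth), here is a small sketch; the user name "alice" is only an example:

# Any shell user can claim any HDFS identity by setting one variable;
# with simple auth the HDFS client trusts the name without verifying it
HADOOP_USER_NAME=alice hdfs dfs -ls /user/alice
HADOOP_USER_NAME=alice hdfs dfs -put report.csv /user/alice/

This is exactly why simple auth offers no real security: the identity is asserted by the client and never checked.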