I am working in a Kerberized cluster where a user submits a Sqoop job from the edge node. The actual MR job on the worker nodes runs as the user 'yarn'.
Is there any way we can configure YARN to use the end user's user id for launching the MR process on the worker nodes?
This applies to both MR and Sqoop, but the job in question is being kicked off using Sqoop. It matters most for jobs that connect to external systems such as SQL Server: the whole environment is Kerberized, and the job cannot authenticate without the user's Kerberos ticket, which is available only if the job runs under that user's ID.
Now as per this documentation - https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/SecureContainer.html ... "YARN containers in a secure cluster use the operating system facilities to offer execution isolation for containers. Secure containers execute under the credentials of the job user. The operating system enforces access restriction for the container. The container must run as the user that submitted the application."
1. Does the user who submits the job (say, xyz) exist on all the NodeManager hosts?
2. Do you have the parameter below set in /etc/hadoop/conf/container-executor.cfg?
3. Can you confirm whether you are using the property below in yarn-site.xml?
yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
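For reference, the property from point 3 might look roughly like this in yarn-site.xml. This is only a sketch: the first property is the one named above, while the second (`yarn.nodemanager.linux-container-executor.group`) and its value `hadoop` are assumptions about a typical setup and should match whatever group your NodeManager actually runs under.

```xml
<!-- yarn-site.xml fragment (sketch; adjust to your cluster) -->
<property>
  <!-- Launch containers through the setuid container-executor binary,
       so that in a secure cluster they run as the submitting user -->
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- Assumption: "hadoop" is a placeholder for the NodeManager's Unix group -->
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
```

After changing this, the NodeManagers need to be restarted for the new executor to take effect.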