How to restrict valid users from submitting Spark YARN jobs?

Expert Contributor

Hi Guys,

On our Kerberized HDP cluster, I have verified that any valid AD user, once granted a TGT via kinit, is able to submit Spark jobs (using both spark-shell and spark-submit). However, I would like to restrict certain groups and users from submitting jobs to the cluster. Is there a way to do this?
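
For context, the flow I tested is roughly the following (the principal and the example jar path are placeholders and may vary per installation):

    # Obtain a Kerberos TGT as an ordinary AD user (placeholder principal)
    kinit alice@EXAMPLE.COM

    # Once authenticated, the user can submit Spark work to YARN, e.g.:
    spark-submit --master yarn --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi \
      /usr/hdp/current/spark-client/lib/spark-examples-*.jar 10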

I see the documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark... talks about creating a separate spark user with a keytab for it, so that the spark user submits the jobs. (Personally, I don't like the idea of all jobs being submitted by a single user.)

Thanks.

1 ACCEPTED SOLUTION

Guru

You can do this with YARN ACLs on the relevant YARN queues. A better way of implementing this would be to use Ranger to set YARN access controls on a given set of queues. You would then require Spark users to specify a queue when submitting a job, and the authorization would be enforced. To make sure this cannot be bypassed, also prevent user access to the default queue.
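
As a sketch of the queue-ACL approach (assuming the Capacity Scheduler; the spark_jobs queue and spark_users group are hypothetical names), the submit ACLs in capacity-scheduler.xml might look like the following. With Ranger managing YARN, the equivalent policies would be created in the Ranger admin UI instead:

    <!-- Queue and group names below are illustrative. ACL values use the
         format "user1,user2 group1,group2"; a single space means nobody. -->
    <property>
      <name>yarn.scheduler.capacity.root.spark_jobs.acl_submit_applications</name>
      <value> spark_users</value>
    </property>
    <property>
      <!-- Lock down the default queue so the ACLs cannot be bypassed -->
      <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
      <value> </value>
    </property>
    <property>
      <!-- Child queue ACLs are OR-ed with their parent's, and root defaults
           to "*", so root must be restricted as well -->
      <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
      <value> </value>
    </property>

Note that queue ACLs are only enforced when yarn.acl.enable is set to true in yarn-site.xml. Authorized users would then submit with an explicit queue, e.g. spark-submit --queue spark_jobs ...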

2 REPLIES

Expert Contributor

Thanks @Simon Elliston Ball, I will try that.

BTW:

I see the documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark... talks about creating a separate spark user with a keytab for it, so that the spark user submits the jobs. (Personally, I don't like the idea of all jobs being submitted by a single user.)

Super Collaborator

This doc is just showing you an example.

Instead of using the principal "spark/blue1@EXAMPLE.COM", you could also consider using the principal "app1/blue1@EXAMPLE.COM" for one application, "app2/blue1@EXAMPLE.COM" for a second application, and so on.
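
As a sketch, such per-application principals and keytabs could be created with kadmin and used at submit time like this (the keytab paths, application class, and jar name are hypothetical; --principal and --keytab are the standard spark-submit options for YARN):

    # Create one principal per application and export a keytab for each
    kadmin -q "addprinc -randkey app1/blue1@EXAMPLE.COM"
    kadmin -q "xst -k /etc/security/keytabs/app1.keytab app1/blue1@EXAMPLE.COM"

    # Each application then submits under its own identity
    spark-submit --master yarn --deploy-mode cluster \
      --principal app1/blue1@EXAMPLE.COM \
      --keytab /etc/security/keytabs/app1.keytab \
      --class com.example.App1 app1.jar

This also pairs well with the queue ACLs above, since each application principal can be granted access only to its own queue.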