The documentation for CDH 5.9 talks about the --principal, --keytab, and --proxy-user arguments to spark-submit. However, the newer versions of that same doc page (CDH 5.10, CDH 5.11, CDH 6.2) no longer mention these options at all. I have read conflicting things from various sources about how to use them, so I am trying to find a definitive explanation. Where are these options documented in the newer CDH versions? Thanks.
There are some changes in the documentation, and we have similar statements on a new page that is more specific to long-running Spark on YARN jobs in cluster mode:
For jobs that run less than 7 days (the default lifetime of a ticket), you should be able to just log in to the KDC using the "kinit" command and then run the job.
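As a rough sketch of the two cases, the script below only builds and prints the commands (a dry run) rather than executing them. The principal, keytab path, class name, and jar path are hypothetical placeholders, and the assumption is that a short job needs nothing more than a prior kinit, while a long-running job passes --principal and --keytab so that Spark can log in again from the keytab itself.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own principal, keytab, and application.
PRINCIPAL="alice@EXAMPLE.COM"
KEYTAB="/path/to/alice.keytab"
MAIN_CLASS="com.example.MyApp"
APP_JAR="/path/to/app.jar"

# Short job: obtain a ticket with kinit first, then submit with no Kerberos options.
SHORT_JOB="spark-submit --master yarn --deploy-mode cluster --class $MAIN_CLASS $APP_JAR"

# Long-running job: hand Spark the principal and keytab on the command line.
LONG_JOB="spark-submit --master yarn --deploy-mode cluster --principal $PRINCIPAL --keytab $KEYTAB --class $MAIN_CLASS $APP_JAR"

# Dry run: print the commands instead of running them.
echo "kinit $PRINCIPAL"
echo "$SHORT_JOB"
echo "$LONG_JOB"
```

To run for real, execute the printed lines in a shell on a gateway host. As far as I know, spark-submit rejects --proxy-user combined with --principal, so the two should not be mixed in one command.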
Did you manage to get any resolution for this? I am unable to run a Spark job with --proxy-user in YARN cluster mode. However, I can run it successfully in yarn-client mode.
This is with the version of Spark shipped in CDH 6.2.1.
There is no problem with the open-source version of Spark when using --proxy-user in either client or cluster mode.