Created 08-22-2018 11:42 AM
I'm trying to use user impersonation for the Spark interpreter in yarn-cluster mode to reduce the load on the Zeppelin host. The HDP cluster is kerberized and Zeppelin is version 0.8.0. When I try to execute some Spark code, I get the following error:
java.lang.RuntimeException: Error: Only one of --proxy-user or --principal can be provided.
A quick Google search shows that spark-submit cannot be given both a principal and a proxy user, which I guess is what Zeppelin is trying to do.
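To make the conflict concrete, here is a minimal Python sketch of the kind of mutual-exclusion check that produces this message. It is only an illustration, not Spark's actual code: `validate_submit_args` and `try_submit` are hypothetical names, and the user and principal values are made up.

```python
from typing import Optional

# Message text copied from the error above.
ERROR = "Only one of --proxy-user or --principal can be provided."


def validate_submit_args(proxy_user: Optional[str], principal: Optional[str]) -> None:
    """Hypothetical stand-in for the argument check that rejects the submission."""
    # Supplying both an impersonated user and a Kerberos principal is rejected.
    if proxy_user is not None and principal is not None:
        raise ValueError(ERROR)


def try_submit(proxy_user: Optional[str], principal: Optional[str]) -> str:
    """Return "ok" if the arguments pass the check, else the error text."""
    try:
        validate_submit_args(proxy_user, principal)
        return "ok"
    except ValueError as exc:
        return f"Error: {exc}"


# Both set: this is the combination Zeppelin appears to be passing.
print(try_submit("alice", "zeppelin@EXAMPLE.COM"))
# Only one set: accepted by the check.
print(try_submit("alice", None))
```

So the fix presumably has to make Zeppelin pass only one of the two when impersonation is enabled.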
Does anyone know how I can fix this and use the Spark interpreter in yarn-cluster mode with user impersonation on a kerberized cluster? Any help would be appreciated.
I think I have found a solution: when I run a Zeppelin notebook with the spark2 interpreter, the job now appears in YARN under the logged-in user and not the zeppelin service user.
Zeppelin - 0.8.0
Zeppelin interpreter - spark2
Cluster: Kerberized, LDAP synchronization with sssd
I added to the spark2 interpreter in Zeppelin (solution from: https://community.cloudera.com/t5/Support-Questions/How-to-make-Zeppelin-s-User-Impersonation-work-w...):
and also deleted:
I also set the following in the spark2 interpreter (solution from: https://zeppelin.apache.org/docs/0.8.0/usage/interpreter/user_impersonation.html):
The interpreter will be instantiated Per User in isolated process
User impersonate - checked
Next, in Ambari -> Ranger KMS -> CONFIGS -> ADVANCED (solution from: https://community.cloudera.com/t5/Support-Questions/HAWQ-Issues-with-Ranger-KMS/td-p/172956), I added:
Then I saved it and restarted the Ranger KMS Server service.
And now it works for me.