User impersonation for Spark yarn-cluster mode not working in Zeppelin in a kerberized HDP cluster

Hey there,

I'm trying to use user impersonation for the Spark interpreter in yarn-cluster mode to reduce the load on the Zeppelin host. The HDP cluster is kerberized and runs Zeppelin 0.8.0. When I try to execute some Spark code, I get the following error:

java.lang.RuntimeException: Error: Only one of --proxy-user or --principal can be provided.

A short Google search shows that spark-submit can't be used with both a principal and a proxy user, which I guess is what Zeppelin is trying to do.

Does anyone know how I can fix this and use the Spark interpreter in yarn-cluster mode with user impersonation on a kerberized cluster? Any help would be appreciated.

Best regards,

Markus

2 REPLIES

I have exactly the same problem with Zeppelin 0.8.0 when I enable user impersonation in the spark2 interpreter.

I think I have found a solution: when I run a notebook with the spark2 interpreter, it now appears in YARN as a job owned by the logged-in user and not the zeppelin service user.

 

My environment:

HDP-3.1.4.0

Zeppelin - 0.8.0

Zeppelin interpreter - spark2

Cluster: Kerberized, LDAP synchronization with sssd

 

I added the following to the spark2 interpreter in Zeppelin (solution from: https://community.cloudera.com/t5/Support-Questions/How-to-make-Zeppelin-s-User-Impersonation-work-w...):

zeppelin.spark.keytab=/etc/security/keytabs/zeppelin.server.kerberos.keytab
zeppelin.spark.principal=zeppelin-mycluster@MYREALM.COM

and also deleted:

spark.yarn.keytab=/etc/security/keytabs/zeppelin.server.kerberos.keytab 
spark.yarn.principal=zeppelin-mycluster@MYREALM.COM
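
To make the property swap above easier to check, here is a minimal Python sketch. It assumes, as the fix suggests, that having spark.yarn.keytab / spark.yarn.principal set while impersonation is enabled is what triggers the "Only one of --proxy-user or --principal" error. The helper name find_conflicts and the dict layout are hypothetical and not part of Zeppelin or Spark; the property values are the ones from this post.

```python
# Hypothetical helper (not a Zeppelin or Spark API): flags the spark.yarn.*
# Kerberos properties that, by assumption, clash with user impersonation.
CONFLICTING_KEYS = ("spark.yarn.keytab", "spark.yarn.principal")


def find_conflicts(props, impersonate):
    """Return the conflicting keys that are set while impersonation is on."""
    if not impersonate:
        return []
    return [key for key in CONFLICTING_KEYS if props.get(key, "").strip()]


# Interpreter properties before the change described in this post.
before = {
    "spark.yarn.keytab": "/etc/security/keytabs/zeppelin.server.kerberos.keytab",
    "spark.yarn.principal": "zeppelin-mycluster@MYREALM.COM",
}

# Interpreter properties after the change: spark.yarn.* removed,
# zeppelin.spark.* added instead.
after = {
    "zeppelin.spark.keytab": "/etc/security/keytabs/zeppelin.server.kerberos.keytab",
    "zeppelin.spark.principal": "zeppelin-mycluster@MYREALM.COM",
}

print(find_conflicts(before, impersonate=True))
print(find_conflicts(after, impersonate=True))
```

The first call lists both spark.yarn.* keys and the second returns an empty list, which matches the state I ended up with.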

I also set the following in the spark2 interpreter (solution from: https://zeppelin.apache.org/docs/0.8.0/usage/interpreter/user_impersonation.html):

Option

The interpreter will be instantiated Per User in isolated process

User impersonate - checked

 

Next, in Ambari -> Ranger KMS -> CONFIGS -> ADVANCED, I added (solution from: https://community.cloudera.com/t5/Support-Questions/HAWQ-Issues-with-Ranger-KMS/td-p/172956):

 

hadoop.kms.proxyuser.zeppelin.hosts=*

hadoop.kms.proxyuser.zeppelin.users=*
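
The two properties above follow the hadoop.kms.proxyuser.<user>.hosts / .users naming pattern, where <user> is the service user allowed to impersonate others (zeppelin here). A small Python sketch, with the hypothetical helper name kms_proxyuser_props, shows how the same pair could be built for another service user. The wildcard defaults mirror what I used; a real cluster may want narrower values.

```python
# Hypothetical helper: builds the KMS proxy-user property pair for one
# service user, following the naming pattern used in this post.
def kms_proxyuser_props(service_user, hosts="*", users="*"):
    """Return the two proxy-user properties as a dict of key -> value."""
    prefix = "hadoop.kms.proxyuser." + service_user
    return {
        prefix + ".hosts": hosts,
        prefix + ".users": users,
    }


# Print the pair in key=value form, ready to paste into the config screen.
for key, value in kms_proxyuser_props("zeppelin").items():
    print(key + "=" + value)
```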

 

Then I saved the change and restarted the Ranger KMS Server service.

 

And now it works for me.