Zeppelin Spark interpreter on a kerberized cluster with encrypted zones fails

We're using HDP 2.6.0.3 with Active Directory/Kerberos, and Ranger/Ranger KMS to handle encrypted zones. If we try to read data from an encrypted zone via the %spark2 interpreter in Zeppelin, like

%spark2.sql
select * from encrypted_datalake.artikel_ref limit 30

we get the following error in the Spark interpreter log:

Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User:zeppelin not allowed to do 'DECRYPT_EEK' on 'bi-master-key'

Presumably this is because the delegation user zeppelin does not have the right to decrypt the key for the encrypted zone. But my login user does have this right, and the %jdbc interpreter, which uses hive as the delegation user, has access, so I can query data from this zone like

%jdbc(hive)
select * from encrypted_datalake.artikel_ref limit 10

without any errors. How can I switch the zeppelin user to a Kerberized user?

1 ACCEPTED SOLUTION

@Ramon Wartala

By design, Zeppelin's spark and spark2 interpreters always execute your query as the 'zeppelin' user; they don't support user impersonation. The query is therefore bound to fail if the 'zeppelin' user doesn't have permission to decrypt the key.

The jdbc, livy and livy2 interpreters do support user impersonation, so your scenario should work with any of these: %livy.sql, %livy2.sql and %jdbc(hive)
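
To give a rough idea of what impersonation through Livy looks like on the wire, here is a minimal Python sketch that builds (but does not send) a session-creation request for Livy's REST API. The `kind` and `proxyUser` fields belong to Livy's `POST /sessions` call; the host name `livy-host`, the user `end_user`, and the helper name `build_session_request` are placeholders made up for this illustration, and the Kerberos/SPNEGO authentication a secured cluster would require is deliberately left out.

```python
import json
import urllib.request


def build_session_request(livy_url, proxy_user, kind="spark"):
    """Build (without sending) a POST /sessions request for Livy.

    livy_url and proxy_user are placeholders chosen for this sketch.
    """
    # proxyUser asks Livy to run the Spark session as this user
    # instead of the account that authenticated to Livy.
    payload = {"kind": kind, "proxyUser": proxy_user}
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # To my understanding, Livy expects this header on POSTs
            # when CSRF protection is enabled; the value is arbitrary.
            "X-Requested-By": "zeppelin",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_session_request("http://livy-host:8999", "end_user")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The point of the sketch is only that the calling service authenticates as itself and names the end user in `proxyUser`, which is why the decrypt permission is then checked against the end user rather than 'zeppelin'.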

OK, I see. But why aren't the livy and livy2 interpreters installed by default in HDP 2.6.0.3? I can't find an installation routine for either interpreter.

@Ramon Wartala

I would suggest checking whether Livy and Livy2 are present under the Spark and Spark2 services respectively. If the Livy and Livy2 servers are not installed on the cluster, the corresponding interpreters won't be present in Zeppelin.

See also: https://issues.apache.org/jira/browse/AMBARI-19919

@Kshitij Badani, you're right. I restarted the Livy2 server, removed the Zeppelin service from Ambari, cleaned up all its config files on the host, and reinstalled the Zeppelin service. After that, the livy2 interpreter was available. But now I get an error when I try to connect with it. The zeppelin-interpreter-livy2-livy-zeppelin...log shows the following error:

ERROR [2017-06-29 18:54:04,427] ({pool-2-thread-7} BaseLivyInterprereter.java[callRestAPI]:416) - Error with 401 StatusCode:
ERROR [2017-06-29 18:54:04,427] ({pool-2-thread-7} BaseLivyInterprereter.java[createSession]:214) - Error when creating livy session for user r00138
org.apache.zeppelin.livy.LivyException: Error with 401 StatusCode:
	at org.apache.zeppelin.livy.BaseLivyInterprereter.callRestAPI(BaseLivyInterprereter.java:448)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.createSession(BaseLivyInterprereter.java:191)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.initLivySession(BaseLivyInterprereter.java:98)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.open(BaseLivyInterprereter.java:80)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:482)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

User r00138 is my Kerberos user. Do I need to set zeppelin.livy.principal or zeppelin.livy.keytab for the zeppelin proxy user? Currently both the principal and the keytab are empty in my configuration. Or should I set up my user credentials under 'Credentials'?

I modified all properties as described in this article and checked every property twice, but I still get

javax.security.auth.login.LoginException: Unable to obtain password from user

inside Zeppelin, and

INFO [2017-06-30 08:44:02,849] ({Thread-0} RemoteInterpreterServer.java[run]:95) - Starting remote interpreter server on port 15012
 INFO [2017-06-30 08:44:03,209] ({pool-1-thread-2} RemoteInterpreterServer.java[createInterpreter]:190) - Instantiate interpreter org.apache.zeppelin.livy.LivyPySparkInterpreter
 INFO [2017-06-30 08:44:03,231] ({pool-1-thread-2} RemoteInterpreterServer.java[createInterpreter]:190) - Instantiate interpreter org.apache.zeppelin.livy.LivySparkInterpreter
 INFO [2017-06-30 08:44:03,234] ({pool-1-thread-2} RemoteInterpreterServer.java[createInterpreter]:190) - Instantiate interpreter org.apache.zeppelin.livy.LivySparkSQLInterpreter
 INFO [2017-06-30 08:44:03,235] ({pool-1-thread-2} RemoteInterpreterServer.java[createInterpreter]:190) - Instantiate interpreter org.apache.zeppelin.livy.LivyPySpark3Interpreter
 INFO [2017-06-30 08:44:03,237] ({pool-1-thread-2} RemoteInterpreterServer.java[createInterpreter]:190) - Instantiate interpreter org.apache.zeppelin.livy.LivySparkRInterpreter
 INFO [2017-06-30 08:44:03,270] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:131) - Job remoteInterpretJob_1498805043269 started by scheduler interpreter_1470680829
ERROR [2017-06-30 08:44:03,640] ({pool-2-thread-2} BaseLivyInterprereter.java[createSession]:214) - Error when creating livy session for user r00138
org.apache.zeppelin.livy.LivyException: org.springframework.web.client.RestClientException: Error running rest call; nested exception is javax.security.auth.login.LoginException: Unable to obtain password from user



inside the log file

/var/log/zeppelin/zeppelin-interpreter-livy2-livy-zeppelin-hdp-cluster-master3.log

@Ramon Wartala Please post a screenshot of your livy2 interpreter config and also the full /etc/livy2/conf/livy.conf file from your Livy2 server host.

# Generated by Apache Ambari. Fri Jun 30 14:11:54 2017


livy.environment production
livy.impersonation.enabled true
livy.repl.enableHiveContext true
livy.server.access_control.enabled true
livy.server.access_control.users livy,zeppelin
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@TCHIBO.TCHIBOROOT.NET
livy.server.auth.type kerberos
livy.server.csrf_protection.enabled true
livy.server.launch.kerberos.keytab /etc/security/keytabs/livy2.service.keytab
livy.server.launch.kerberos.principal livy/_HOST@TCHIBO.TCHIBOROOT.NET
livy.server.port 8999
livy.server.recovery.mode recovery
livy.server.recovery.state-store filesystem
livy.server.recovery.state-store.url /livy2-recovery
livy.server.session.timeout 3600000
livy.spark.master yarn-cluster
livy.superusers zeppelin-datalake
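
As a reading aid for the config above, here is a small Python sketch that parses the whitespace-separated `key value` format and pulls out the settings that govern authentication and impersonation. The function names `parse_livy_conf` and `impersonation_summary` are hypothetical helpers invented for this example, not part of Livy or Ambari, and the sample text is just a subset of the lines posted above.

```python
def parse_livy_conf(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def impersonation_summary(settings):
    """Collect the settings that decide who may impersonate whom."""
    def as_list(key):
        return [u.strip() for u in settings.get(key, "").split(",") if u.strip()]

    return {
        "impersonation_enabled": settings.get("livy.impersonation.enabled") == "true",
        "auth_type": settings.get("livy.server.auth.type"),
        "superusers": as_list("livy.superusers"),
        "allowed_users": as_list("livy.server.access_control.users"),
    }


# A subset of the livy.conf lines posted above.
SAMPLE = """\
# Generated by Apache Ambari.
livy.impersonation.enabled true
livy.server.access_control.users livy,zeppelin
livy.server.auth.type kerberos
livy.superusers zeppelin-datalake
"""

if __name__ == "__main__":
    print(impersonation_summary(parse_livy_conf(SAMPLE)))
```

Seeing these values side by side makes it easier to compare the superuser and access-control entries against the principal that Zeppelin actually uses when it talks to Livy.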

@Ramon Wartala Please attach a screenshot of the livy2 interpreter config as well. Also, as described in this article: https://discuss.pivotal.io/hc/en-us/articles/201914097-Hadoop-daemons-in-a-secured-cluster-fails-to-...

Are you seeing any statement like this in your Zeppelin logs?

java.io.IOException: Login failure for hdfs/dev6ha@SATURN.LOCAL from keytab /etc/security/phd/keytab/hdfs.service.keytab