
Livy and Hive Warehouse Connector with Kerberos

Explorer

I have configured Spark/Zeppelin as described here:

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_zeppelin_conf...

https://community.hortonworks.com/articles/223626/integrating-apache-hive-with-apache-spark-hive-war...

Spark and the Spark interpreter in Zeppelin work well, but Livy does not.

Here is the stack trace from Zeppelin:

Caused by: java.sql.SQLException: Cannot create PoolableConnectionFactory (Could not open client transport for any of the Server URI's in ZooKeeper: Peer indicated failure: Unsupported mechanism type PLAIN)
	at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2291)
	at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2038)
	at org.apache.commons.dbcp2.BasicDataSource.getLogWriter(BasicDataSource.java:1588)
	at org.apache.commons.dbcp2.BasicDataSourceFactory.createDataSource(BasicDataSourceFactory.java:588)
	at com.hortonworks.spark.sql.hive.llap.JDBCWrapper.getConnector(HS2JDBCWrapper.scala:333)
	at com.hortonworks.spark.sql.hive.llap.JDBCWrapper.getConnector(HS2JDBCWrapper.scala:340)
	at com.hortonworks.spark.sql.hive.llap.DefaultJDBCWrapper.getConnector(HS2JDBCWrapper.scala)
	at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.lambda$new$0(HiveWarehouseSessionImpl.java:48)
	at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.execute(HiveWarehouseSessionImpl.java:66)
	... 12 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Peer indicated failure: Unsupported mechanism type PLAIN
	at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:333)
	at shadehive.org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
	at org.apache.commons.dbcp2.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:39)
	at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:256)
	at org.apache.commons.dbcp2.BasicDataSource.validateConnectionFactory(BasicDataSource.java:2301)
	at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2287)
	... 20 more
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: Unsupported mechanism type PLAIN
	at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307)
	at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
	at shadehive.org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:420)
	at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:301)
	... 25 more

And here are the container logs:

...
18/12/26 22:56:17 INFO ClientCnxn: Socket connection established, initiating session, client: /169.254.4.10:36348, server: hdp-master02.hadoop.local/169.254.4.8:2181
18/12/26 22:56:17 INFO ClientCnxn: Session establishment complete on server hdp-master02.hadoop.local/169.254.4.8:2181, sessionid = 0x267ea6bea4a00a1, negotiated timeout = 60000
18/12/26 22:56:17 INFO ConnectionStateManager: State change: CONNECTED
18/12/26 22:56:17 INFO CuratorFrameworkImpl: backgroundOperationsLoop exiting
18/12/26 22:56:17 INFO ZooKeeper: Session: 0x267ea6bea4a00a1 closed
18/12/26 22:56:17 INFO ClientCnxn: EventThread shut down
18/12/26 22:56:17 WARN HiveConnection: Failed to connect to hdp-master03.hadoop.local:10500
18/12/26 22:56:17 INFO CuratorFrameworkImpl: Starting
18/12/26 22:56:17 INFO ZooKeeper: Initiating client connection, connectString=hdp-master01.hadoop.local:2181,hdp-master02.hadoop.local:2181,hdp-master03.hadoop.local:2181 sessionTimeout=60000 watcher=shadecurator.org.apache.curator.ConnectionState@64f34919
18/12/26 22:56:17 INFO ClientCnxn: Opening socket connection to server hdp-master01.hadoop.local/169.254.4.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/26 22:56:17 INFO ClientCnxn: Socket connection established, initiating session, client: /169.254.4.10:55506, server: hdp-master01.hadoop.local/169.254.4.7:2181
18/12/26 22:56:17 INFO ClientCnxn: Session establishment complete on server hdp-master01.hadoop.local/169.254.4.7:2181, sessionid = 0x167ea6be68f00be, negotiated timeout = 60000
18/12/26 22:56:17 INFO ConnectionStateManager: State change: CONNECTED
18/12/26 22:56:17 INFO CuratorFrameworkImpl: backgroundOperationsLoop exiting
18/12/26 22:56:17 INFO ZooKeeper: Session: 0x167ea6be68f00be closed
18/12/26 22:56:17 INFO ClientCnxn: EventThread shut down
18/12/26 22:56:17 ERROR Utils: Unable to read HiveServer2 configs from ZooKeeper

Relevant Spark properties from the application container:

spark.security.credentials.hiveserver2.enabled	true
spark.sql.hive.hiveserver2.jdbc.url.principal	hive/_HOST@hadoop.local
spark.sql.hive.hiveserver2.jdbc.url	jdbc:hive2://hdp-master01.hadoop.local:2181,hdp-master02.hadoop.local:2181,hdp-master03.hadoop.local:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
livy.spark.yarn.security.credentials.hiveserver2.enabled true 

Can anyone help me?

3 REPLIES

Explorer

The following errors were in the Livy server log "livy-livy-server.out":

18/12/27 00:51:58 INFO LineBufferedStream: stdout: 18/12/27 00:51:58 WARN HiveServer2CredentialProvider: Failed to get HS2 delegation token
...
18/12/27 00:51:58 INFO LineBufferedStream: stdout: Caused by: shadehive.org.apache.hive.service.cli.HiveSQLException: Error retrieving delegation token for user
...
18/12/27 00:51:58 INFO LineBufferedStream: stdout: Caused by: org.apache.hadoop.security.authorize.AuthorizationException: Unauthorized connection for super-user: hive/hdp-master03.hadoop.local@HADOOP.LOCAL from IP 169.254.4.8

Problem: the Livy host was not listed in the hadoop.proxyuser.hive.hosts property of core-site.xml.

Solution: add the Livy server host to hadoop.proxyuser.hive.hosts.
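
For illustration, a minimal sketch of the corresponding core-site.xml entry (the hostnames here are placeholders; keep the hosts you already have and append your Livy server host, or use * as shown further below for testing):

<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>hdp-master03.hadoop.local,livy-server.hadoop.local</value>
</property>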

avatar
Expert Contributor

Were you able to make the Hive Warehouse Connector work with Kerberos in the Spark Zeppelin interpreter when using USER IMPERSONATION, or only when running as the "zeppelin" user?

In my case, all the other Spark client interfaces (spark-shell, pyspark, spark-submit, etc.) are working, and I'm experiencing the same problem as you when trying to use Livy with HWC. But in my case the Spark Zeppelin interpreter is also not working when I enable impersonation, which is needed to impose authorization restrictions on Hive data access with Ranger.

If you were able to make HWC work with the Spark interpreter in Zeppelin AND IMPERSONATION enabled, I would be very grateful if you could share the changes you made in the interpreter's configuration to make this work.

The Zeppelin Spark Interpreter with impersonation disabled is working with HWC, but I NEED data access authorization, so this is not an option for me.

Best Regards

Explorer

Hi!

We use Hive with LLAP, so "run as end user" = false.

Impersonation is enabled for the Livy interpreter.

We also use Ranger to manage permissions.


Services / Spark2 / Configs

Custom livy2-conf

livy.file.local-dir-whitelist = /usr/hdp/current/hive_warehouse_connector/ 
livy.spark.security.credentials.hiveserver2.enabled = true 
livy.spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/ 
livy.spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU 
livy.spark.yarn.security.credentials.hiveserver2.enabled = true 
livy.superusers = zeppelin-dwh_test 


Custom spark2-defaults

spark.datasource.hive.warehouse.load.staging.dir = /tmp
spark.datasource.hive.warehouse.metastoreUri = thrift://dwh-test-hdp-master03.COMPANY.ru:9083
spark.hadoop.hive.llap.daemon.service.hosts = @llap0
spark.hadoop.hive.zookeeper.quorum = dwh-test-hdp-master01.COMPANY.ru:2181,dwh-test-hdp-master02.COMPANY.ru:2181,dwh-test-hdp-master03.COMPANY.ru:2181
spark.history.ui.admin.acls = knox   
spark.security.credentials.hive.enabled = true
spark.security.credentials.hiveserver2.enabled = true
spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU 
spark.sql.hive.llap = true
spark.yarn.security.credentials.hiveserver2.enabled = true


Custom spark2-hive-site-override

hive.llap.daemon.service.hosts = @llap0 


Services / HDFS / Configs

For testing, you may also set these values to an asterisk (*) to check whether the problem is with proxy-user delegation.

Custom core-site

hadoop.proxyuser.hive.groups *
hadoop.proxyuser.hive.hosts  *
hadoop.proxyuser.livy.groups *    
hadoop.proxyuser.livy.hosts  *
hadoop.proxyuser.zeppelin.hosts *
hadoop.proxyuser.zeppelin.groups *


Zeppelin

livy2 %livy2 Interpreter

Properties

name value

livy.spark.hadoop.hive.llap.daemon.service.hosts    @llap0
livy.spark.jars    file:/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar
livy.spark.security.credentials.hiveserver2.enabled    true
livy.spark.sql.hive.hiveserver2.jdbc.url    jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
livy.spark.sql.hive.hiveserver2.jdbc.url.principal    hive/_HOST@COMPANY.RU
livy.spark.sql.hive.llap    true
livy.spark.submit.pyFiles    file:/usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip
livy.spark.yarn.security.credentials.hiveserver2.enabled    true
livy.superusers    livy,zeppelin
spark.security.credentials.hiveserver2.enabled    true
spark.sql.hive.hiveserver2.jdbc.url.principal    hive/_HOST@COMPANY.RU
zeppelin.livy.concurrentSQL    false
zeppelin.livy.displayAppInfo    true
zeppelin.livy.keytab    /etc/security/keytabs/zeppelin.server.kerberos.keytab
zeppelin.livy.maxLogLines    1000
zeppelin.livy.principal    zeppelin-dwh_test@COMPANY.RU
zeppelin.livy.pull_status.interval.millis    1000
zeppelin.livy.restart_dead_session    false
zeppelin.livy.session.create_timeout    120
zeppelin.livy.spark.sql.field.truncate    true
zeppelin.livy.spark.sql.maxResult    1000
zeppelin.livy.url    http://dwh-test-hdp-master02.COMPANY.ru:8999


Sample code for testing:

%livy2

import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()

hive.showDatabases().show(100)
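
To confirm that Ranger policies are actually enforced for the impersonated end user, you can also run a read query through HWC and then look for the access in the Ranger audit log (a minimal sketch; the database and table names are placeholders):

%livy2

// Hypothetical database and table; substitute objects covered by your Ranger policies.
hive.setDatabase("default")
hive.executeQuery("SELECT * FROM test_table LIMIT 10").show()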


Ranger audit example:

[screenshot: Ranger audit log entry, 2019-02-26]