Livy and Hive Warehouse Connector with Kerberos
Created 12-26-2018 10:59 PM
I have configured Spark/Zeppelin as described here:
spark-shell and the Spark interpreter in Zeppelin work well, but the Livy interpreter does not.
Here is the stack trace from Zeppelin:
Caused by: java.sql.SQLException: Cannot create PoolableConnectionFactory (Could not open client transport for any of the Server URI's in ZooKeeper: Peer indicated failure: Unsupported mechanism type PLAIN)
    at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2291)
    at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2038)
    at org.apache.commons.dbcp2.BasicDataSource.getLogWriter(BasicDataSource.java:1588)
    at org.apache.commons.dbcp2.BasicDataSourceFactory.createDataSource(BasicDataSourceFactory.java:588)
    at com.hortonworks.spark.sql.hive.llap.JDBCWrapper.getConnector(HS2JDBCWrapper.scala:333)
    at com.hortonworks.spark.sql.hive.llap.JDBCWrapper.getConnector(HS2JDBCWrapper.scala:340)
    at com.hortonworks.spark.sql.hive.llap.DefaultJDBCWrapper.getConnector(HS2JDBCWrapper.scala)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.lambda$new$0(HiveWarehouseSessionImpl.java:48)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.execute(HiveWarehouseSessionImpl.java:66)
    ... 12 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Peer indicated failure: Unsupported mechanism type PLAIN
    at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:333)
    at shadehive.org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
    at org.apache.commons.dbcp2.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:39)
    at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:256)
    at org.apache.commons.dbcp2.BasicDataSource.validateConnectionFactory(BasicDataSource.java:2301)
    at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2287)
    ... 20 more
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: Unsupported mechanism type PLAIN
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at shadehive.org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:420)
    at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:301)
    ... 25 more
And here are the container logs:
...
18/12/26 22:56:17 INFO ClientCnxn: Socket connection established, initiating session, client: /169.254.4.10:36348, server: hdp-master02.hadoop.local/169.254.4.8:2181
18/12/26 22:56:17 INFO ClientCnxn: Session establishment complete on server hdp-master02.hadoop.local/169.254.4.8:2181, sessionid = 0x267ea6bea4a00a1, negotiated timeout = 60000
18/12/26 22:56:17 INFO ConnectionStateManager: State change: CONNECTED
18/12/26 22:56:17 INFO CuratorFrameworkImpl: backgroundOperationsLoop exiting
18/12/26 22:56:17 INFO ZooKeeper: Session: 0x267ea6bea4a00a1 closed
18/12/26 22:56:17 INFO ClientCnxn: EventThread shut down
18/12/26 22:56:17 WARN HiveConnection: Failed to connect to hdp-master03.hadoop.local:10500
18/12/26 22:56:17 INFO CuratorFrameworkImpl: Starting
18/12/26 22:56:17 INFO ZooKeeper: Initiating client connection, connectString=hdp-master01.hadoop.local:2181,hdp-master02.hadoop.local:2181,hdp-master03.hadoop.local:2181 sessionTimeout=60000 watcher=shadecurator.org.apache.curator.ConnectionState@64f34919
18/12/26 22:56:17 INFO ClientCnxn: Opening socket connection to server hdp-master01.hadoop.local/169.254.4.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/26 22:56:17 INFO ClientCnxn: Socket connection established, initiating session, client: /169.254.4.10:55506, server: hdp-master01.hadoop.local/169.254.4.7:2181
18/12/26 22:56:17 INFO ClientCnxn: Session establishment complete on server hdp-master01.hadoop.local/169.254.4.7:2181, sessionid = 0x167ea6be68f00be, negotiated timeout = 60000
18/12/26 22:56:17 INFO ConnectionStateManager: State change: CONNECTED
18/12/26 22:56:17 INFO CuratorFrameworkImpl: backgroundOperationsLoop exiting
18/12/26 22:56:17 INFO ZooKeeper: Session: 0x167ea6be68f00be closed
18/12/26 22:56:17 INFO ClientCnxn: EventThread shut down
18/12/26 22:56:17 ERROR Utils: Unable to read HiveServer2 configs from ZooKeeper
Environment variables from app container:
spark.security.credentials.hiveserver2.enabled true
spark.sql.hive.hiveserver2.jdbc.url.principal hive/_HOST@hadoop.local
spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://hdp-master01.hadoop.local:2181,hdp-master02.hadoop.local:2181,hdp-master03.hadoop.local:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
livy.spark.yarn.security.credentials.hiveserver2.enabled true
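To make it easier to see what the JDBC URL above actually asks for (ZooKeeper service discovery in the hiveserver2-interactive namespace), here is a minimal Scala sketch that splits out the `;key=value` session parameters. The object and method names are invented for illustration, and it assumes the usual `jdbc:hive2://hosts/db;params?hiveconf#hivevar` layout; it is not the Hive driver's own parser.

```scala
object JdbcUrlParams {
  // Returns the ";key=value" session parameters of a HiveServer2 JDBC URL.
  // Anything after '?' (hive conf list) or '#' (hive var list) is ignored.
  def sessionParams(url: String): Map[String, String] = {
    val sessionPart = url.takeWhile(c => c != '?' && c != '#')
    sessionPart.split(";").drop(1).flatMap { kv =>
      kv.split("=", 2) match {
        case Array(k, v) => Some(k.trim -> v.trim)
        case _           => None // skip fragments without '='
      }
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val url = "jdbc:hive2://hdp-master01.hadoop.local:2181,hdp-master02.hadoop.local:2181," +
      "hdp-master03.hadoop.local:2181/;serviceDiscoveryMode=zooKeeper;" +
      "zooKeeperNamespace=hiveserver2-interactive"
    sessionParams(url).foreach { case (k, v) => println(s"$k = $v") }
  }
}
```

A URL without any `;` parameters, such as a direct `host:10000/` connection, yields an empty map.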
Can anyone help me?
Created 12-26-2018 11:00 PM
The following errors were in the Livy server log "livy-livy-server.out":
18/12/27 00:51:58 INFO LineBufferedStream: stdout: 18/12/27 00:51:58 WARN HiveServer2CredentialProvider: Failed to get HS2 delegation token ...
18/12/27 00:51:58 INFO LineBufferedStream: stdout: Caused by: shadehive.org.apache.hive.service.cli.HiveSQLException: Error retrieving delegation token for user ...
18/12/27 00:51:58 INFO LineBufferedStream: stdout: Caused by: org.apache.hadoop.security.authorize.AuthorizationException: Unauthorized connection for super-user: hive/hdp-master03.hadoop.local@HADOOP.LOCAL from IP 169.254.4.8
Problem: the Livy host was not listed in the hadoop.proxyuser.hive.hosts property of core-site.xml.
Solution: add the host that runs the Livy server to hadoop.proxyuser.hive.hosts.
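As a rough way to reason about that check, the Scala sketch below tests whether a connecting host is covered by a hadoop.proxyuser.hive.hosts value. It is a simplified illustration only, not Hadoop's implementation (which, as far as I know, also accepts IP ranges). The object and method names are invented, the "before" value is hypothetical, and the hostname for 169.254.4.8 is the one shown in the container logs.

```scala
object ProxyHostCheck {
  // hostsValue: comma-separated content of hadoop.proxyuser.hive.hosts
  // candidates: names/addresses the connecting host may be known by
  def isAllowed(hostsValue: String, candidates: Set[String]): Boolean = {
    val entries = hostsValue.split(",").map(_.trim).filter(_.nonEmpty).toSet
    // "*" allows every host; otherwise an exact match is required in this sketch
    entries.contains("*") || entries.exists(candidates.contains)
  }

  def main(args: Array[String]): Unit = {
    // Host from the "Unauthorized connection ... from IP 169.254.4.8" error
    val livyHost = Set("hdp-master02.hadoop.local", "169.254.4.8")
    val before = "hdp-master03.hadoop.local" // hypothetical original value
    val after  = before + ",hdp-master02.hadoop.local"
    println(s"before: ${isAllowed(before, livyHost)}, after: ${isAllowed(after, livyHost)}")
  }
}
```

Changes to core-site.xml generally need the affected services to be restarted (or the proxy-user settings refreshed) before they take effect.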
Created 02-26-2019 03:23 AM
Were you able to make Hive Warehouse Connector work with Kerberos in the Spark Zeppelin interpreter when using USER IMPERSONATION, or only when running as the "zeppelin" user?
In my case all the other Spark client interfaces (spark-shell, pyspark, spark-submit, etc.) are working, and I'm experiencing the same problem as you when trying to use Livy with HWC. But in my case the Spark Zeppelin interpreter also stops working when I enable impersonation, which is needed to enforce authorization restrictions on Hive data access with Ranger.
If you were able to make HWC work with the Spark interpreter in Zeppelin AND IMPERSONATION enabled, I would be very grateful if you could share the changes you made to the interpreter's configuration to make this work.
The Zeppelin Spark Interpreter with impersonation disabled is working with HWC, but I NEED data access authorization, so this is not an option for me.
Best Regards
Created on 02-26-2019 02:12 PM - edited 08-17-2019 03:40 PM
Hi!
We use Hive with LLAP, so "run as end user" = false.
Impersonation is enabled for the Livy interpreter.
We also use Ranger to manage permissions.
Services / Spark2 / Configs
Custom livy2-conf
livy.file.local-dir-whitelist = /usr/hdp/current/hive_warehouse_connector/
livy.spark.security.credentials.hiveserver2.enabled = true
livy.spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
livy.spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
livy.spark.yarn.security.credentials.hiveserver2.enabled = true
livy.superusers = zeppelin-dwh_test
Custom spark2-defaults
spark.datasource.hive.warehouse.load.staging.dir = /tmp
spark.datasource.hive.warehouse.metastoreUri = thrift://dwh-test-hdp-master03.COMPANY.ru:9083
spark.hadoop.hive.llap.daemon.service.hosts = @llap0
spark.hadoop.hive.zookeeper.quorum = dwh-test-hdp-master01.COMPANY.ru:2181,dwh-test-hdp-master02.COMPANY.ru:2181,dwh-test-hdp-master03.COMPANY.ru:2181
spark.history.ui.admin.acls = knox
spark.security.credentials.hive.enabled = true
spark.security.credentials.hiveserver2.enabled = true
spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
spark.sql.hive.llap = true
spark.yarn.security.credentials.hiveserver2.enabled = true
Custom spark2-hive-site-override
hive.llap.daemon.service.hosts = @llap0
Services / HDFS / Configs
For testing, you may also set the values below to an asterisk to check whether the problem is in delegation (proxy-user) settings.
Custom core-site
hadoop.proxyuser.hive.groups *
hadoop.proxyuser.hive.hosts *
hadoop.proxyuser.livy.groups *
hadoop.proxyuser.livy.hosts *
hadoop.proxyuser.zeppelin.hosts *
hadoop.proxyuser.zeppelin.groups *
Zeppelin
livy2 %livy2 Interpreter
Properties
name value
livy.spark.hadoop.hive.llap.daemon.service.hosts @llap0
livy.spark.jars file:/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar
livy.spark.security.credentials.hiveserver2.enabled true
livy.spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
livy.spark.sql.hive.hiveserver2.jdbc.url.principal hive/_HOST@COMPANY.RU
livy.spark.sql.hive.llap true
livy.spark.submit.pyFiles file:/usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip
livy.spark.yarn.security.credentials.hiveserver2.enabled true
livy.superusers livy,zeppelin
spark.security.credentials.hiveserver2.enabled true
spark.sql.hive.hiveserver2.jdbc.url.principal hive/_HOST@COMPANY.RU
zeppelin.livy.concurrentSQL false
zeppelin.livy.displayAppInfo true
zeppelin.livy.keytab /etc/security/keytabs/zeppelin.server.kerberos.keytab
zeppelin.livy.maxLogLines 1000
zeppelin.livy.principal zeppelin-dwh_test@COMPANY.RU
zeppelin.livy.pull_status.interval.millis 1000
zeppelin.livy.restart_dead_session false
zeppelin.livy.session.create_timeout 120
zeppelin.livy.spark.sql.field.truncate true
zeppelin.livy.spark.sql.maxResult 1000
zeppelin.livy.url http://dwh-test-hdp-master02.COMPANY.ru:8999
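A note on the property names: as far as I understand, the Zeppelin Livy interpreter passes non-empty properties that start with `livy.spark.` to the Livy session with the `livy.` prefix removed, which is why the HWC settings are repeated here in that form. The Scala sketch below only illustrates that mapping under this assumption; the object and method names are invented and this is not Zeppelin's code.

```scala
object LivyConfMapping {
  // Keeps non-empty "livy.spark.*" entries and strips the leading "livy."
  // so that the result looks like ordinary Spark configuration keys.
  def toSparkConf(props: Map[String, String]): Map[String, String] =
    props.collect {
      case (k, v) if k.startsWith("livy.spark.") && v.nonEmpty =>
        k.stripPrefix("livy.") -> v
    }

  def main(args: Array[String]): Unit = {
    val interpreterProps = Map(
      "livy.spark.sql.hive.llap" -> "true",
      "livy.spark.sql.hive.hiveserver2.jdbc.url.principal" -> "hive/_HOST@COMPANY.RU",
      "zeppelin.livy.url" -> "http://dwh-test-hdp-master02.COMPANY.ru:8999"
    )
    toSparkConf(interpreterProps).foreach { case (k, v) => println(s"$k = $v") }
  }
}
```

Under the same assumption, entries without the prefix (for example `zeppelin.livy.url`) configure the interpreter itself and are not forwarded to Spark.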
Sample code for testing:
%livy2
import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
hive.showDatabases().show(100)
Ranger audit example: