Created on 06-30-2017 10:19 PM
This article is a continuation of the HCC article https://community.hortonworks.com/content/kbentry/101181/rowcolumn-level-security-in-sql-for-apache-....
One can take advantage of Spark's row/column-level security via various Zeppelin interpreters, as summarized in the following table:
| Interpreter name | Row/column security supported? | Reason for no support |
|---|---|---|
| %jdbc (with Spark 1.x STS) | Yes | |
| %jdbc (with Spark2 STS) | Yes | |
| %livy.sql | No | The livy interpreter runs in yarn-cluster mode, where delegation tokens are required to access HiveServer2; Spark 1.x does not support this. |
| %livy2.sql | Yes | |
| %spark.sql | No | Zeppelin's Spark interpreter group does not support user impersonation. |
| %spark2.sql | No | Zeppelin's Spark interpreter group does not support user impersonation. |
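To make the effect concrete: with a Ranger row-filter policy of `country = 'US'` on a table (the table, columns, and policy here are hypothetical), a restricted user's query is transparently rewritten before execution:

```sql
-- The user runs:
SELECT name, salary FROM hr.employees;

-- With the row-filter policy applied, the engine effectively executes:
SELECT name, salary FROM hr.employees WHERE country = 'US';
```

Column masking works analogously: a masked column (e.g. salary) is returned redacted or nulled according to the policy, without any change to the user's SQL.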
In this article, we show how to configure Zeppelin's livy2 and jdbc interpreters to take advantage of the row/column-level security feature provided by Spark in HDP 2.6.1.
For Spark2 with livy2 interpreter

Configure the following properties in Zeppelin's livy2 interpreter settings (the placeholder values come from your cluster's Hive configuration):

livy.spark.sql.hive.llap = true
livy.spark.hadoop.hive.llap.daemon.service.hosts = <value of hive.llap.daemon.service.hosts>
livy.spark.jars = <HDFS path of spark2-llap jar>
livy.spark.sql.hive.hiveserver2.jdbc.url = <hiveserver2 jdbc URL>
livy.spark.sql.hive.hiveserver2.jdbc.url.principal = <value of hive.server2.authentication.kerberos.principal>
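Once these interpreter properties are set, a Zeppelin note can submit SQL through livy2, and the rows and columns returned reflect the Ranger policies of the logged-in Zeppelin user. A minimal sketch (the table name is a placeholder):

```sql
%livy2.sql
SELECT * FROM hr.employees LIMIT 10
```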
We can use Zeppelin's jdbc interpreter to route SQL queries to Spark 1.x or Spark2 by configuring it to use the Spark 1.x thrift server when invoked with %jdbc(spark), and the Spark2 thrift server when invoked with %jdbc(spark2):
spark.driver : org.apache.hive.jdbc.HiveDriver
spark.url : <Spark1.x thrift server jdbc url>
spark2.driver : org.apache.hive.jdbc.HiveDriver
spark2.url : <Spark2 thrift server jdbc url>
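With these properties in place, a note paragraph can target either thrift server by prefix; as with livy2, results are filtered per the logged-in user's Ranger policies (the table name below is a placeholder):

```sql
%jdbc(spark)
SELECT count(*) FROM hr.employees
```

Switching the prefix to %jdbc(spark2) routes the same query through the Spark2 thrift server instead.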
For Spark2 with jdbc interpreter
For an HDP 2.6.1 cluster, configure spark_thrift_cmd_opts in spark2-env as:
--packages com.hortonworks.spark:spark-llap-assembly_2.11:1.1.2-2.1 --repositories http://repo.hortonworks.com/content/groups/public --conf spark.sql.hive.llap=true
(The above HCC article was written for HDP 2.6.0.3, where it suggests setting spark_thrift_cmd_opts in spark2-env to --packages com.hortonworks.spark:spark-llap-assembly_2.11:1.1.1-2.1 --repositories http://repo.hortonworks.com/content/groups/public --conf spark.sql.hive.llap=true; note the older 1.1.1-2.1 assembly version.)
For Spark 1.x with jdbc interpreter
For an HDP 2.6.1 cluster, configure spark_thrift_cmd_opts in spark-env as:
--packages com.hortonworks.spark:spark-llap-assembly_2.10:1.0.6-1.6 --repositories http://repo.hortonworks.com/content/groups/public --conf spark.sql.hive.llap=true
Created on 08-23-2017 07:16 AM
After following this guide, I still got Kerberos errors when trying to communicate with LLAP. It turns out that you also need to set livy.spark.yarn.security.credentials.hiveserver2.enabled=true in Zeppelin's Livy interpreter to make it work.
Created on 03-14-2018 05:36 PM
This doesn't work for HDP 2.6.3.