Created 12-08-2015 05:42 AM
How do we manage authorization control over tables within SparkSQL?
Will Ranger enforce existing Hive policies when these Hive tables are accessed via SparkSQL? If not, what is the recommended approach?
Created 12-08-2015 05:48 AM
TL;DR: SparkSQL today provides table-level access control and doesn't provide Hive-style column-level access control.
Spark reads from both the Hive metastore and ORC (or Parquet) files directly in HDFS.
For ORC files, security at the HDFS level still applies, so READ/WRITE is controlled by HDFS ACLs or Ranger.
Right now Spark doesn't propagate the end-user identity to the Hive metastore, and we are working in the community to enhance this.
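To make the HDFS-level control a bit more concrete, here is a minimal Python sketch that only builds (it does not run) the `hdfs dfs -setfacl` command granting a user read access to a table's directory. The user `alice`, the path `/apps/hive/warehouse/sales.db/orders`, and the helper name `setfacl_command` are made up for illustration; substitute your own table location. It also assumes ACLs are enabled on the NameNode (`dfs.namenode.acls.enabled=true`).

```python
import re
import shlex


def setfacl_command(principal, path, perms="r-x", kind="user", recursive=True):
    """Build an `hdfs dfs -setfacl` argument list (hypothetical helper).

    The command modifies (-m) the ACL on `path` so that `principal`
    gets `perms`; nothing is executed here.
    """
    if kind not in ("user", "group"):
        raise ValueError("kind must be 'user' or 'group'")
    if not re.fullmatch(r"[r-][w-][x-]", perms):
        raise ValueError("perms must look like 'r-x' or 'rwx'")
    args = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        args.append("-R")  # apply to the directory and everything under it
    args += ["-m", f"{kind}:{principal}:{perms}", path]
    return args


# Assumed table location and user, for illustration only.
table_dir = "/apps/hive/warehouse/sales.db/orders"
print(shlex.join(setfacl_command("alice", table_dir)))
```

The printed command can be reviewed and then run by an HDFS admin. Note that this restricts access to the whole table directory, which matches the table-level granularity described above; it cannot hide individual columns.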
Created 12-08-2015 10:58 PM
What steps need to be performed to get table-level access control via Spark using Ranger?
a) Do we enable the Hive plugin?
It would be good to have some notes or docs that show the steps to enable table-level access control.
If Spark is used in conjunction with the Hue Livy server, does anything change?
Created 12-15-2015 12:41 AM
"Right now Spark doesn't propagate end user identity to Hive meta store and we are working in the community to enhance this."
I assume Spark runs Hive queries as the hive user. Does this mean that Spark has access to all data stored in Hive even though the Ranger plugin is active?
Created 12-16-2015 05:41 PM
The nuance is in how you use SparkSQL. If you use spark-shell, the identity of the user launching the shell is used for Hive access. If you use the Spark Thrift Server, the identity used to access Hive data is the identity that launched the Spark Thrift Server, not the identity of the user connecting to it.
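As a rough way to picture this, the small Python sketch below models which identity ends up being used for Hive access in each case. It is a toy model of the behavior described in this thread, not Spark code; the function name `hive_identity`, the entry-point labels, and the user names are made up for illustration.

```python
def hive_identity(entry_point, launching_user, connecting_user=None):
    """Return the identity used for Hive access (toy model, hypothetical)."""
    if entry_point == "spark-shell":
        # The shell runs as whoever started it.
        return launching_user
    if entry_point == "thrift-server":
        # No impersonation: the connecting user is ignored and the
        # account that started the server is used for every session.
        return launching_user
    raise ValueError(f"unknown entry point: {entry_point}")


print(hive_identity("spark-shell", "alice"))
print(hive_identity("thrift-server", "spark", connecting_user="bob"))
```

In the second call, `bob` connects over JDBC but data is accessed as `spark`, so any policy written for `bob` is not what gets evaluated.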
Created 12-17-2015 11:36 AM
@vshukla is there any equivalent of hiveserver2 "doAs" in Spark Thrift Server?
Created 12-17-2015 03:21 PM
STS (Spark Thrift Server) doesn't support doAs today, and there is an open ticket for it:
https://issues.apache.org/jira/browse/SPARK-5159
We plan to work in the community to resolve it.
Created 12-17-2015 03:22 PM
Created 03-01-2016 11:56 AM
Hi,
Can you please provide the steps that need to be performed to get table-level access control via Spark using Ranger?
Thanks and Regards
Shyam
Created 06-08-2018 01:00 AM
Has anyone implemented Ranger with Spark SQL? Any instructions on this would be very helpful. I did the setup, but the policies have no effect when a user runs queries. I must be missing something.