Ranger security when accessing Hive tables via SparkSQL?
Labels: Apache HCatalog, Apache Ranger, Apache Spark
Created ‎12-08-2015 05:42 AM
How do we manage authorization control over tables accessed through SparkSQL?
Will Ranger enforce existing Hive policies when these Hive tables are accessed via SparkSQL? If not, what is the recommended approach?
Created ‎12-08-2015 05:48 AM
TL;DR: SparkSQL today provides table-level access control and doesn't provide Hive-level (column-level) access control.
Spark reads from the Hive metastore and reads ORC (or Parquet) files directly from HDFS.
For ORC files, security at the HDFS layer still applies, so READ/WRITE access is controlled by HDFS ACLs or Ranger.
Right now Spark doesn't propagate the end-user identity to the Hive metastore; we are working in the community to enhance this.
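Since Spark reads the table files directly, one way to approximate table-level control is to restrict the table's directory in HDFS. Here is a minimal sketch that only builds the `hdfs dfs -setfacl` / `-getfacl` command lines and does not run them. The warehouse path, the user names, and the helper `table_acl_commands` are hypothetical and only for illustration. It also assumes the directory is otherwise closed to other users, since ACL grants add access rather than remove it.

```python
import shlex


def table_acl_commands(table_dir, readers, writers=()):
    """Build `hdfs dfs` command lines granting access on one table directory.

    table_dir, readers, and writers are hypothetical inputs for illustration.
    """
    # Readers need r-x to list the directory and read its files.
    entries = [f"user:{user}:r-x" for user in readers]
    # Writers additionally need w to create or replace files.
    entries += [f"user:{user}:rwx" for user in writers]
    spec = ",".join(entries)
    return [
        # -R applies the entries recursively, -m modifies the existing ACL.
        ["hdfs", "dfs", "-setfacl", "-R", "-m", spec, table_dir],
        # Print the resulting ACL so it can be reviewed.
        ["hdfs", "dfs", "-getfacl", table_dir],
    ]


if __name__ == "__main__":
    # Hypothetical table location and users.
    for argv in table_acl_commands(
        "/apps/hive/warehouse/sales.db/orders", ["alice"], ["etl"]
    ):
        print(shlex.join(argv))
```

The same restriction could instead be expressed as a Ranger HDFS policy on the table path. Either way, enforcement happens at the file level, so it cannot distinguish individual columns.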
Created ‎12-08-2015 10:58 PM
What steps need to be performed to get table-level access control via Spark using Ranger?
a) Do we enable the Hive plugin?
It would be good to have some notes or docs that show the steps to enable table-level access control.
Does anything change if Spark is used in conjunction with the Hue Livy server?
Created ‎12-15-2015 12:41 AM
"Right now Spark doesn't propagate end user identity to Hive meta store and we are working in the community to enhance this."
I assume Spark runs Hive queries as the hive user. Does this mean that Spark has access to all data stored in Hive even though the Ranger plugin is active?
Created ‎12-16-2015 05:41 PM
The nuance is in how you use SparkSQL. If you use spark-shell, the identity of the user launching the shell is used for Hive access. If you use the Spark Thrift Server, the identity used to access Hive data is the identity used to launch the Spark Thrift Server.
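The rule above can be written down as a small toy model. This is not Spark code; the function `effective_hive_identity` and its parameter names are hypothetical and only encode the behavior described in this post. The identity that counts is the owner of the Spark process, never the user who connects to it.

```python
def effective_hive_identity(entry_point, process_owner, connecting_user=None):
    """Return the user whose permissions apply to Hive data access.

    Toy model of the behavior described in the post above, not a Spark API.
    """
    if entry_point == "spark-shell":
        # The end user launches the shell, so the process owner is the end user.
        return process_owner
    if entry_point == "thrift-server":
        # connecting_user is deliberately ignored: every JDBC/ODBC client
        # shares the identity that launched the server.
        return process_owner
    raise ValueError(f"unknown entry point: {entry_point}")


if __name__ == "__main__":
    # Hypothetical users: alice runs her own shell, spark runs the server.
    print(effective_hive_identity("spark-shell", "alice"))
    print(effective_hive_identity("thrift-server", "spark", "alice"))
```

In this model, a policy written for `alice` only takes effect when she launches the shell herself. Through the Thrift Server, her queries are evaluated with whatever access the server's own user has.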
Created ‎12-17-2015 11:36 AM
@vshukla is there any equivalent of HiveServer2 "doAs" in Spark Thrift Server?
Created ‎12-17-2015 03:21 PM
The Spark Thrift Server (STS) doesn't support doAs today, and there is an open ticket for it:
https://issues.apache.org/jira/browse/SPARK-5159
We plan to work in the community to resolve it.
Created ‎03-01-2016 11:56 AM
Hi,
Can you please provide the steps that need to be performed to get table-level access control via Spark using Ranger?
Thanks and Regards
Shyam
Created ‎06-08-2018 01:00 AM
Has anyone implemented Ranger with Spark SQL? Any instructions would be greatly helpful. I did the setup, but the policies have no effect when a user runs queries. I am missing something there.
