- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Spark and HIVE
- Labels:
-
Apache Hive
-
Apache Spark
Created 02-23-2017 04:07 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Do Spark and HIVE have the abilities to set permissions?
For example, only people of a certain role can view a certain table. Also, only people of a certain role can view this column.
Created 02-23-2017 04:57 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
For Hive one would use Apache Ranger for this. You can allow or deny access to tables, columns and even rows.
Now, what to do with Spark:
For the normal HiveContext Spark would read the Schema from Metastore and then read the the file directly from HDFS. So no Hive Ranger plugin would kick in.
However, with LLAP it will be possible, see e.g. https://hortonworks.com/blog/sparksql-ranger-llap-via-spark-thrift-server-bi-scenarios-provide-row-c... If you additionally disable HDFS access for "others" for Hive tables, data is access controlled
Created 02-23-2017 04:57 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
For Hive one would use Apache Ranger for this. You can allow or deny access to tables, columns and even rows.
Now, what to do with Spark:
For the normal HiveContext Spark would read the Schema from Metastore and then read the the file directly from HDFS. So no Hive Ranger plugin would kick in.
However, with LLAP it will be possible, see e.g. https://hortonworks.com/blog/sparksql-ranger-llap-via-spark-thrift-server-bi-scenarios-provide-row-c... If you additionally disable HDFS access for "others" for Hive tables, data is access controlled
Created 02-23-2017 06:58 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If I create tables in SparkSQL, how to I enable fine-grained permissions for these tables?
Or is this only possible using HiveQL?
Created 02-24-2017 07:33 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Fine grained permissions (row level, column masking, ...) are created in Ranger for any Hive table - whether created by HiveQL or SparkQL
So if you create a new table in Hive via SparkSQL that should be used by others with access control, you need to create the appropriate policies afterwards in Ranger.
For less fine grained permissions (delete update, insert delete) you can also use the SQL commands of https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBa... with SparkSQL
Created 02-24-2017 02:40 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How do I create Hive tables in Spark?