Spark and HIVE

Expert Contributor

Do Spark and Hive have the ability to set permissions?

For example, can access to a certain table be restricted to people with a certain role? And can access to a particular column be restricted to a certain role as well?

1 ACCEPTED SOLUTION

For Hive one would use Apache Ranger for this. You can allow or deny access to tables, columns and even rows.

Now, what to do with Spark:

With the normal HiveContext, Spark reads the schema from the Metastore and then reads the files directly from HDFS, so the Ranger Hive plugin never kicks in.

However, with LLAP it will be possible, see e.g. https://hortonworks.com/blog/sparksql-ranger-llap-via-spark-thrift-server-bi-scenarios-provide-row-c... If you additionally disable HDFS access for "others" on the Hive table directories, the data is access controlled.

Expert Contributor

If I create tables in SparkSQL, how do I enable fine-grained permissions for these tables?

Or is this only possible using HiveQL?

Fine-grained permissions (row-level filtering, column masking, ...) are created in Ranger for any Hive table, whether it was created by HiveQL or SparkSQL.

So if you create a new table in Hive via SparkSQL that should be used by others with access control, you need to create the appropriate policies afterwards in Ranger.
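As a rough illustration of the first half of that (creating the table from Spark), here is a minimal PySpark sketch. It assumes Spark 2.x or later built with Hive support and a reachable Metastore; on Spark 1.x you would go through HiveContext instead. The database, table, and column names are made up for the example, and the Ranger policies still have to be created separately afterwards.

```python
def create_table_ddl(database, table, columns):
    """Build a CREATE TABLE statement from a list of (column name, type) pairs."""
    cols = ", ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return f"CREATE TABLE IF NOT EXISTS `{database}`.`{table}` ({cols}) STORED AS ORC"


def create_hive_table(ddl):
    """Run the DDL through a Hive-enabled SparkSession.

    Needs pyspark and a configured Hive Metastore, so the import is done
    lazily and the DDL helper above stays usable without Spark installed.
    """
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("create-hive-table")
        .enableHiveSupport()  # use the Hive Metastore as the catalog
        .getOrCreate()
    )
    spark.sql(ddl)


# Hypothetical database, table, and columns, for illustration only.
ddl = create_table_ddl(
    "sales", "customers", [("id", "BIGINT"), ("name", "STRING"), ("ssn", "STRING")]
)
print(ddl)
# On a cluster you would then call: create_hive_table(ddl)
```

Once the table is registered in the Metastore, it shows up in Ranger like any other Hive table, and that is where you would add the table, column, or row-level policies.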

For less fine-grained permissions (select, insert, update, delete) you can also use the SQL commands of https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBa... with SparkSQL.
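To give an idea of what those SQL commands look like, here is a small Python sketch that builds table-level GRANT statements in the SQL standard based authorization syntax. The helper, table, and role names are hypothetical. It only produces the statement text; I am assuming you submit it through a SQL client your setup supports (for example Beeline against HiveServer2), and whether a given Spark version accepts GRANT directly is something to verify on your cluster.

```python
# Table-level privileges covered by this sketch.
TABLE_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def grant_statement(privileges, table, role):
    """Build a GRANT statement giving table privileges to a role."""
    privs = [p.upper() for p in privileges]
    unknown = [p for p in privs if p not in TABLE_PRIVILEGES]
    if unknown:
        raise ValueError(f"unsupported privileges: {unknown}")
    return f"GRANT {', '.join(privs)} ON TABLE {table} TO ROLE {role}"


# Hypothetical table and role names, for illustration only.
for stmt in [
    grant_statement(["select"], "sales.customers", "analyst"),
    grant_statement(["select", "insert"], "sales.customers", "etl"),
]:
    print(stmt + ";")
```

This only covers whole-table privileges. Column masking and row filtering are still something you would set up as Ranger policies.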

Expert Contributor

How do I create Hive tables in Spark?