
Does a Spark job honor Ranger Hive policies?

Solved

New Contributor

Hi

I am running a Spark job on a Ranger-enabled HDP cluster. The job reads from one Hive table and writes to another, and I am seeing that the Ranger Hive policies are not being honored.

Is this the expected behavior for a Spark job with Ranger? Is Spark supported with Ranger?

1 ACCEPTED SOLUTION


Re: Does a Spark job honor Ranger Hive policies?

That sounds like everything is working as designed/implemented. Ranger does not currently (as of HDP 2.4) have a supported plug-in for Spark, and when Spark reads Hive tables it is not going through the "front door" of Hive (HiveServer2) to run queries; it reads the table's files directly from HDFS.

That said, the underlying HDFS authorization policies (with or without Ranger) will be honored if they are in place.
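
For illustration, here is a minimal sketch of the direct-read path described above (Spark 1.6 on HDP 2.4; the database and table names are placeholders, not from the original post):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object DirectHiveRead {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DirectHiveRead"))
    val hiveContext = new HiveContext(sc)

    // Spark resolves the table's location from the metastore and then reads
    // the files (e.g. under /apps/hive/warehouse/) directly from HDFS.
    // If this user lacks HDFS read access to that path, the job fails with an
    // HDFS AccessControlException -- a Ranger Hive policy is never consulted.
    val df = hiveContext.table("default.source_table")
    df.write.saveAsTable("default.target_table")
  }
}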



Re: Does a Spark job honor Ranger Hive policies?

Explorer

The workaround is to use SQLContext (instead of HiveContext) with JDBC to connect to HiveServer2, which will honor Ranger's authorization policies.
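
As a rough sketch of that workaround (Spark 1.6-style API; the host, port, credentials, and table name are placeholders for your cluster, and the Hive JDBC driver must be on the classpath):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JdbcHiveRead {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("JdbcHiveRead"))
    val sqlContext = new SQLContext(sc)

    // Every query issued through this connection runs inside HiveServer2 as
    // the connecting user, so Ranger's Hive policies are evaluated normally.
    val df = sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:hive2://hs2-host:10000/default")
      .option("driver", "org.apache.hive.jdbc.HiveDriver")
      .option("dbtable", "source_table")
      .option("user", "hive_user")
      .option("password", "hive_password")
      .load()

    df.show()
  }
}

Note that the first link below covers a "Method not supported" error that Spark's JDBC data source can raise against the Hive JDBC driver, so test this path on your Spark/Hive versions.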

The following links give some idea of how Spark, JDBC, and SQLContext work together:

http://stackoverflow.com/questions/32195946/method-not-supported-in-spark

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_dataintegration/content/hive-jdbc-odbc-d...
