Created 07-26-2016 08:12 AM
Hi
I am running a Spark job on a Ranger-enabled HDP cluster. This Spark job reads from one Hive table and writes to another Hive table. What I am seeing is that the Ranger Hive policies are not being honored.
Is this the expected behavior of a Spark job with Ranger? Is Spark supported with Ranger?
Created 07-26-2016 03:23 PM
That sounds like everything is working as designed/implemented: Ranger does not currently (as of HDP 2.4) have a supported plug-in for Spark, and when Spark reads Hive tables it is not going through the "front door" of Hive (HiveServer2) to actually run queries; it reads the table's files directly from HDFS.
That said, the underlying HDFS authorization policies (with or without Ranger) will be honored if they are in place.
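To make that concrete, here is a minimal sketch (Spark 1.x shell, with a hypothetical table db.sales): HiveContext only uses the metastore to find the table's location, then reads the underlying files straight from HDFS, so only HDFS permissions (POSIX/ACLs or Ranger HDFS policies) on the warehouse directory are checked, not Ranger Hive policies.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("DirectHiveRead"))
    val hiveContext = new HiveContext(sc)

    // Spark resolves db.sales via the metastore, then reads its files
    // directly from HDFS -- HiveServer2 (and its Ranger Hive plug-in) is
    // never involved, so only HDFS-level authorization is enforced.
    val df = hiveContext.sql("SELECT * FROM db.sales")
    df.show()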
Created 11-17-2016 07:54 AM
The workaround is to use SQLContext (instead of HiveContext) with JDBC to connect to HiveServer2, which will honor Ranger's authorization policies.
The following link gives some idea of how Spark, JDBC, and SQLContext work together.
http://stackoverflow.com/questions/32195946/method-not-supported-in-spark
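A rough sketch of that JDBC approach is below (the HiveServer2 URL, table name, and user are hypothetical; on a kerberized cluster the JDBC URL would also need a principal). Because the query executes inside HiveServer2, the Ranger Hive plug-in authorizes it against the connecting user. Note that the Hive JDBC driver does not implement every method Spark's JDBC source expects, which appears to be what the linked thread is about, so expect some rough edges.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("RangerAwareHiveRead"))
    val sqlContext = new SQLContext(sc)

    // Read through HiveServer2 over JDBC; the query runs server-side,
    // where Ranger Hive policies are enforced for the connecting user.
    val df = sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:hive2://hs2-host:10000/default")  // hypothetical HiveServer2 endpoint
      .option("driver", "org.apache.hive.jdbc.HiveDriver")
      .option("dbtable", "db.sales")                          // hypothetical source table
      .option("user", "etl_user")                             // user whose Ranger policies apply
      .load()
    df.show()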