
Accessing Hive on Spark via HiveServer2 and impersonation support


Hi All,

In our secured cluster, HiveServer2 runs as the "hive" user. When we connect to Hive via Beeline, the Ranger policies (defined through the Ranger Hive plugin) are respected, because the control flow goes through HiveServer2, which checks whether the user has access to tables x, y, z, etc. Since impersonation is set to false, the query is executed as the hive user, which has broader permissions on HDFS, so everything works well in the Beeline-Hive scenario.

Now, with Spark, when an end user connects to Hive from spark-shell or the Python shell, for example, the connection goes directly to the Hive Metastore and not to HS2, so Ranger does not come into play. In addition, all queries are executed as the end user. The end user does not have permission to access the files directly on HDFS, so the Spark-Hive query fails.

I know this is a pending issue, but for now we need a workaround that lets us give end users access to the data without changing the existing HDFS access permissions, or with only minimal changes to them.

(This is on HDP 2.5.x)

Many thanks,




@Arpan Rajani

You can try creating a group for the Spark users and giving it read permission on the HDFS location of the Hive table using an ACL. This should let those users read the data from spark-shell.
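
As a rough sketch of the ACL suggestion above, the small Python helper below builds the `hdfs dfs -setfacl` and `hdfs dfs -getfacl` calls for one table directory. The group name `spark_users` and the path `/apps/hive/warehouse/sales.db/orders` are placeholders I made up for illustration, not values from your cluster. It also assumes HDFS ACLs are enabled (`dfs.namenode.acls.enabled=true`) and that the script is run by a user allowed to change ACLs on that path. By default it only prints the commands (dry run).

```python
import shlex
import subprocess


def build_acl_commands(group, table_path):
    """Build hdfs CLI calls granting `group` read access to `table_path`.

    The spec combines an access ACL entry (for existing files and
    directories) with a default ACL entry (inherited by newly created
    children of the directories).
    """
    spec = f"group:{group}:r-x,default:group:{group}:r-x"
    return [
        # -R applies the change recursively, -m modifies the existing ACL
        ["hdfs", "dfs", "-setfacl", "-R", "-m", spec, table_path],
        # Print the resulting ACL so it can be checked by eye
        ["hdfs", "dfs", "-getfacl", table_path],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = [shlex.join(cmd) for cmd in commands]
    for cmd, text in zip(commands, rendered):
        print(text)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    # Placeholder group and path: replace with your own values.
    cmds = build_acl_commands("spark_users", "/apps/hive/warehouse/sales.db/orders")
    run_commands(cmds, dry_run=True)
```

Note that the group also needs execute (traverse) permission on every parent directory of the table location, otherwise the read will still be denied. Keep in mind that this grants access at the HDFS level only, so Ranger's Hive policies are still not evaluated for Spark jobs.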