In the secured cluster, we have a HiveServer2 running as the "hive" user. When we connect to Hive via Beeline, Ranger policies (defined via the Hive Ranger plugin) are enforced because the control flow goes through HiveServer2, which verifies whether the user has access to tables x, y, z, etc. Since we have set impersonation to false, queries execute as the hive user, and the hive user has broader permissions on HDFS, so the Beeline-to-Hive scenario works well.
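For reference, the impersonation setting described above corresponds to this hive-site.xml property (an illustrative fragment; the property name is standard, the surrounding file layout is assumed):

```xml
<!-- hive-site.xml: run all queries as the HiveServer2 service user ("hive")
     instead of impersonating the connecting end user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
```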
Now, with Spark, when an end user connects to Hive from spark-shell or pyspark, for example, the connection goes directly to the Hive Metastore rather than through HS2, so Ranger never comes into play. In addition, all queries execute as the end user, who naturally does not have permission to read the files directly on HDFS, so the Spark-to-Hive access fails.
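The contrast between the two access paths can be sketched as follows (hostnames, realm, and table names are hypothetical):

```shell
# Beeline goes through HiveServer2, so Ranger policies are enforced there:
beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM"

# spark-shell instead talks to the metastore directly (hive.metastore.uris)
# to fetch table metadata, then reads the table files from HDFS
# as the end user -- bypassing HS2 and Ranger entirely:
spark-shell
scala> spark.sql("SELECT * FROM db.tbl").show()
// fails with an HDFS AccessControlException if the end user
// cannot read the underlying warehouse files
```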
I know this is a pending issue, but for now we need a workaround that gives end users access to the data without creating new HDFS access permissions (or with only minimal changes to the existing ones).
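For the kind of minimal change we have in mind, one candidate is HDFS ACLs layered on top of the existing permissions (a sketch only, assuming `dfs.namenode.acls.enabled=true`; the group name and warehouse path are hypothetical):

```shell
# Grant an analysts group read access on one table's directory via ACLs,
# leaving the base owner/group/mode bits untouched:
hdfs dfs -setfacl -R -m group:analysts:r-x /apps/hive/warehouse/db.db/tbl

# Add a default ACL so files written later inherit the same access:
hdfs dfs -setfacl -R -m default:group:analysts:r-x /apps/hive/warehouse/db.db/tbl
```

This avoids loosening permissions for everyone, but it does not restore Ranger's column- or row-level controls, so it only helps where directory-level read access is acceptable.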