Created 05-14-2018 03:36 AM
If Ranger is doing the query rewrites, why does it need LLAP? Why isn't Spark + HDFS sufficient for the query filtering?
Created 05-14-2018 05:39 PM
Created 05-15-2018 04:41 AM
I read this article before, but filtering and projection are already provided by the Spark execution engine over HDFS. Why do we need LLAP in the middle purely for row-level security? (Note: I am not talking about the general performance benefits of LLAP here.)
How about this?
User submits query -> Ranger authorizes it and modifies the query filters/projections -> the new query gets executed as normal Spark SQL over HDFS (with no need for LLAP)
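To make the proposed flow concrete, here is a rough sketch of the rewrite step only. It is not how Ranger or any Spark plugin actually does it: the policy table `ROW_FILTERS`, the function `apply_row_filter`, and the user and table names are all made up for illustration, and a real implementation would rewrite the parsed query plan rather than the SQL text (this regex version does not handle table aliases, for example).

```python
import re

# Hypothetical policy store: (user, table) -> row-filter predicate.
# In the flow above, these predicates would come from Ranger policies.
ROW_FILTERS = {
    ("alice", "sales"): "region = 'EU'",
}


def apply_row_filter(sql, user):
    """Wrap each policy-covered table reference in a filtered subquery."""

    def wrap(match):
        keyword, table = match.group(1), match.group(2)
        predicate = ROW_FILTERS.get((user, table.lower()))
        if predicate is None:
            # No row-filter policy for this user/table: leave it untouched.
            return match.group(0)
        return f"{keyword}(SELECT * FROM {table} WHERE {predicate}) {table}"

    # Match the table name that follows FROM or JOIN.
    return re.sub(r"\b(FROM\s+|JOIN\s+)(\w+)", wrap, sql, flags=re.IGNORECASE)


rewritten = apply_row_filter(
    "SELECT id, amount FROM sales WHERE amount > 100", "alice"
)
print(rewritten)
```

The rewritten string could then be submitted as ordinary Spark SQL, which is the point of the question: once the filter is part of the query, the engine that executes it does not have to be LLAP.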
Created 04-26-2020 10:11 AM
Indeed, there is no need to use LLAP.
You can use this library to achieve what you are asking for without LLAP:
https://github.com/apache/submarine/tree/master/submarine-security/spark-security
It works between Spark and Ranger.