Created 05-02-2017 11:17 PM
hello - I've implemented Spark security using LLAP, and seeing error in specific scenario
Here is what is done ->
1 - I login to Spark Thrift server using user - spark
2 - created a Ranger policy which specifies that user - 'spark' does not have access to column - storekey in table - factsales
3 - fired query -> select count(1) from factsales;
Error is as shown below.
Pls. note - this seems to be happening when i try to get count of rows.
When i fire query to get values of specific columns, i get expected result (based on whether i have access to the column or not).
------------------------------- QUERY WHERE I GET ERROR (in getting count) --------------------
[alanka01@nwk2-bdp-hadoop-06 ~]$ beeline -u jdbc:hive2://nwk2-bdp-hadoop-08.gdcs-qa.apple.com:10015/default -n spark Connecting to jdbc:hive2://nwk2-bdp-hadoop-08.gdcs-qa.apple.com:10015/default Connected to: Spark SQL (version 1.6.2) Driver: Hive JDBC (version 1.2.1000.2.5.3.0-37) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 1.2.1000.2.5.3.0-37 by Apache Hive 0: jdbc:hive2://nwk2-bdp-hadoop-08.gdcs-qa.ap>
select count(1) from factsales;
Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: TungstenAggregate(key=[], functions=[(count(1),mode=Final,isDistinct=false)], output=[_c0#402L]) +- TungstenExchange SinglePartition, None +- TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#405L]) +- Scan LlapRelation(org.apache.spark.sql.hive.llap.LlapContext@32b7eb41,Map(table -> default.factsales, url -> jdbc:hive2://nwk2-bdp-hadoop-06.gdcs-qa.apple.com:10500))[] (state=,code=0)
-------------------------QUERIES WITH EXPECTED RESULT -1 -------------------------------------
0: jdbc:hive2://nwk2-bdp-hadoop-08.gdcs-qa.ap>
select saleskey from factsales limit 10;
+-----------+--+ | saleskey | +-----------+--+ | 3343549 | | 2822385 | | 2764012 | | 3289348 | | 2531906 | | 3055870 | | 2530527 | | 2880758 | | 2297049 | | 3356058 | +-----------+--+
-------------------------QUERIES WITH EXPECTED RESULT -2 (since user - spark does not have access to column - storekey) ----------------------------------------------------------------------------------------------------------
0: jdbc:hive2://nwk2-bdp-hadoop-09.gdcs-qa.ap>
select saleskey, storekey from factsales limit 10;
Error: java.io.IOException: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to compile query: org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException: Permission denied: user [spark] does not have [SELECT] privilege on [default/factsales/saleskey,storekey] (state=,code=0)
Created 05-03-2017 01:55 AM
To do count(*) you need select privilege on all table. You can still do "select count(column name) from <table name>" and that will work but to run count(*) you need to have select permissions for whole table. This is working as expected.
Created 05-02-2017 11:19 PM
@mqureshi -looping you in, any ideas ?
Created 05-03-2017 01:55 AM
To do count(*) you need select privilege on all table. You can still do "select count(column name) from <table name>" and that will work but to run count(*) you need to have select permissions for whole table. This is working as expected.
Created 05-04-2017 10:27 PM
@mqureshi - i guess what you mentioned makes sense, the error message, however, does not indicate the actual issue.