Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Querying a Hive Table (via Hiveserver2) with Column Masking enabled via Ranger Hive Plugin returns with an error.

avatar
Explorer

I am running a HDP 2.5.3 cluster with HiveServer2 enabled on an edge node. Ranger is enforcing security. In addition, I've created a policy in ranger that redacts a certain column with pii on select.

12247-y2sye.png

Unfortunately, when I execute the following query in a hive client:

select * from <tablename> limit 100;

I get the following error:

[42000]: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:62 rule Identifier failed predicate: {allowQuotedId()}? line 1:74 rule Identifier failed predicate: {allowQuotedId()}? line 1:94 rule Identifier failed predicate: {allowQuotedId()}? line 1:117 rule Identifier failed predicate: {allowQuotedId()}?

If I disable the above policy, the query runs successfully.

10 REPLIES 10

avatar
Expert Contributor

which user you are using to run the query ? only the allowed users can do the select * because it includes the the column in the policy.

avatar
Explorer

I am logged-in as a user that is in the "select user". So yes, this user has permissions to run the query.

avatar
Rising Star

@Navendu Garg: I tried your scenario in my test environment and had no issues accessing column with masking option set to 'redact'. On giving select cmd, the values were shown as xxxxx instead of usual values.

Also, based on https://issues.apache.org/jira/browse/HIVE-6013, I inserted values into my column with ` and ', also set column name with ` and ' but the behavior was fine.

What hive client are you using?

Also, can you let me know what kind of value was in your column? Did the column name or any of column values contain any kind of quotes?

I am assuming that you have an access level hive policy granting ur user atleast select privilege to the table and the columns you are accessing in ur select statement, and then a masking policy which masks one of the columns for the same user running the query.

avatar
Explorer
I've tried it on SQL Workbench and Aginity Workbench, and I get errors on both clients.
Column is a varchar(9).

Here is what I discovered. A table with column names that have spaces (e.g. `account year`), this feature fails. When I added an underscore to the column (account_year), the masking started to work.

avatar
Rising Star

I tried the above scenario and @Navendu Garg's exact issue is described below:

There is a table say 'default.testtable1' with one of the column names containing a space say 'account name'.

[create table testtable1(Name String,`Account number` String,Age int);

...log/webhcat/webhcat.log.2017-02-14:hive.support.quoted.identifiers=column]

A user say 'user1' has a Hive access policy granting all permissions to user1 on default.testtable1, all columns.

There is also a masking policy which masks column 'name' in same table 'default.testtable1' with mask type='redact' for user1.

select * from default.testtable1 throws:

Error: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:41 rule Identifier failed predicate: {allowQuotedId()}?
line 1:56 rule Identifier failed predicate: {allowQuotedId()}?
line 1:129 rule Identifier failed predicate: {allowQuotedId()}?
line 1:144 rule Identifier failed predicate: {allowQuotedId()}? (state=42000,code=40000)

Audit shows only masking policy allowed the txn. There is no deny and no access policy mentioned in the audit. The flow didn't reach the access policies auth.

On disabling the masking policy, the select shows correct results.

NOTE: I tried it with mask type as 'Hash','Partial mask','Nullify' and same result.

@Madhan Neethiraj, Will you pls confirm if this is expected - If any of the column names in a table contain spaces and there is a mask policy on any of that table's columns, then select with fail with the above exception.

avatar
Rising Star

I think Hive folks should look into this one.

CC: @Eugene Koifman, @Wei Zheng

avatar
Super Collaborator

@Navendu Garg, could you share your table definition (show create table T)

avatar
Rising Star

Hi @Eugene Koifman: I was able to reproduce the exact scenario that Navendu mentioned and table create cmd I used was:

[create table testtable1(Name String,`Account number` String,Age int);

...log/webhcat/webhcat.log.2017-02-14:hive.support.quoted.identifiers=column]

avatar
New Contributor

I think its a bug in ranger, where quoted.indentifiers are not being honoured.

To reproduce:

  • set hive.support.sql11.reserved.keywords=false
  • create test-user who has restricted access to some rows in ranger (row filter) say, bangalore filter for service_city column.
  • create a table with one of the column is reserved word like "date"
select date from temp_db.test limit 10; //or
select `date` from temp_db.test limit 10;
  • Resolved query by Ranger:
select date from (SELECT *, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `temp_db`.`test`  WHERE lower(service_city)='bangalore')`test` limit 10 
  • Exception:
2017-10-09 13:28:27,517 ERROR [HiveServer2-Handler-Pool: Thread-60373]: ql.Driver (SessionState.java:printError(993)) - FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:30 Failed to recognize predicate 'date'. Failed rule: 'identifier' in expression specification org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.parse.ParseException: line 1:30 Failed to recognize predicate 'date'. Failed rule: 'identifier' in expression specification at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10321) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10436) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1221) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1215) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:264) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:470) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:457) at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) at com.sun.proxy.$Proxy39.executeStatementAsync(Unknown Source) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:313) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.parse.ParseException: line 1:30 Failed to recognize predicate 'date'. Failed rule: 'identifier' in expression specification at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:205) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10319) 


Workaround:

>hive.support.quoted.identifiers=none;

Fire query> select `date` from temp_db.test limit 10;

Note: select * works without any issues even though the table has columns with reserved words.