Ranger: is it possible to grant access to a selection of columns of an external Hive table, without access to the HDFS files?

I have a couple of external Hive tables, and I need to grant a group of users access to only the non-sensitive columns using Ranger (HDP 2.6.3). During some tests with a test user, I found that he can only access these non-sensitive columns if he also has access to the path on HDFS. The HDFS path is secured by Ranger as well. I've set the HDFS permissions to no access at all.

Requiring HDFS access for those users would defeat the purpose of granting access to a selection of columns, because the HDFS files contain all the sensitive data as well.

Is there a way to both protect the HDFS files and grant access to a selection of columns?

Example create table statement:

CREATE EXTERNAL TABLE berth_data
(`MUTATION_TYPE` string,
`D_MUTATION` string,
`T_MUTATION` string)
ROW FORMAT DELIMITED  
FIELDS TERMINATED BY '\t'  
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION  '/data/production/sensitive/berth_data';

@Felix Albani Thanks for that answer. It looks like I face an interesting choice:

Set hive.server2.enable.doAs=false, so that Hive queries run on HDFS as the HiveServer2 process user. Then I can restrict users to specific columns in Hive without giving them access to the HDFS files. But the Hive permissions I choose will become much more important.

Keep hive.server2.enable.doAs=true, and I will not be able to do column-based access control in Hive. But I have the comfort that if someone gets access to a Hive table without the HDFS access, they still cannot get to the data.

I'll have to think about this.
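
For illustration, a rough sketch of how the first option could be checked from Beeline as the test user. It assumes hive.server2.enable.doAs=false and a Ranger Hive policy that allows SELECT on only two columns of berth_data; which columns count as non-sensitive is an assumption made up for this example, not something stated above.

```sql
-- Print the impersonation setting that the current HiveServer2 session sees.
SET hive.server2.enable.doAs;

-- Assumed policy: the test user may SELECT only MUTATION_TYPE and D_MUTATION.
-- With doAs=false the files are read by the HiveServer2 service user,
-- so the test user should not need any HDFS permission on
-- /data/production/sensitive/berth_data for this query.
SELECT `MUTATION_TYPE`, `D_MUTATION`
FROM berth_data
LIMIT 10;

-- A query that touches a column outside the assumed policy is expected
-- to be rejected by the Ranger Hive authorizer.
SELECT `T_MUTATION`
FROM berth_data
LIMIT 10;
```

If the first SELECT succeeds while the test user still has no access to the HDFS path, column-level access is being enforced in Hive alone, which is the situation described in the first option.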