Ranger KMS vs Ranger HDFS policies vs Ranger Hive policies

I have a Kerberized HDP cluster with Ranger enabled, where the data is encrypted through KMS across multiple encryption zones.
Users can access the data only through the Hive interface.

To provide access to all the data, I can choose one of these options:

1. Give the hive user access to specific HDFS folders, along with permission to decrypt the data. However, if the hive user is compromised, the attacker has access to all the data.

2. Enable the doAs option in Hive and access HDFS as the end user. However, this requires policies for each user in both HDFS and Hive, and if a user has direct access to HDFS (for whatever reason), the column-level Hive permissions become useless.

What's the valid approach here?

1 ACCEPTED SOLUTION

@Jakub Igla

The answer is that it depends on your use case.

Customers who use doAs=true tell me they like having audit records at the HDFS operation level. At the same time, as you rightly point out, column-level authorization is incomplete with doAs=true, because users then have access to the underlying data. It's a trade-off, and I have personally seen both approaches used in production.
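
To make the trade-off concrete, here is a rough Python sketch of which principal needs which Ranger policy in each mode. It is only a toy model, not Ranger's API: the names `Grant`, `required_grants`, and `can_bypass_column_policies`, the user `alice`, and the simplified permission labels are all made up for illustration, and it assumes the Hive service user is called `hive`.

```python
from typing import List, NamedTuple


class Grant(NamedTuple):
    service: str     # Ranger service holding the policy: "hive", "hdfs" or "kms"
    principal: str   # user the policy is granted to
    permission: str  # simplified, illustrative permission label


def required_grants(end_user: str, do_as: bool,
                    hive_service_user: str = "hive") -> List[Grant]:
    """List the grants needed for end_user to query an encrypted table via Hive."""
    # Hive-level authorization is evaluated for the end user in both modes.
    grants = [Grant("hive", end_user, "select")]
    # With doAs=true, HDFS and KMS are accessed as the end user;
    # with doAs=false, they are accessed as the Hive service user.
    storage_principal = end_user if do_as else hive_service_user
    grants.append(Grant("hdfs", storage_principal, "read"))
    grants.append(Grant("kms", storage_principal, "decrypt_eek"))
    return grants


def can_bypass_column_policies(end_user: str, do_as: bool) -> bool:
    """True if end_user personally holds both the HDFS and KMS grants,
    i.e. could read the files directly without going through Hive."""
    held = {g.service for g in required_grants(end_user, do_as)
            if g.principal == end_user}
    return {"hdfs", "kms"} <= held


if __name__ == "__main__":
    for mode in (False, True):
        print("doAs =", mode)
        for grant in required_grants("alice", do_as=mode):
            print("  ", grant)
        print("   alice can bypass column policies:",
              can_bypass_column_policies("alice", do_as=mode))
```

In this model, doAs=false concentrates the HDFS and KMS grants on the single hive user (the compromise risk from option 1), while doAs=true spreads them to each end user, which is what gives per-user HDFS audit but also lets a user read the files around the column-level policies.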

HTH
