Can un-authorized Hive column be masked or redacted? (instead of failing the query altogether)

Super Collaborator

This is one of the most common questions I get from customers, who say that failing a Hive query altogether makes no sense in an enterprise environment and would rather have the unauthorized column redacted.

1 ACCEPTED SOLUTION

Super Collaborator

Because this is the question I am asked most often by enterprise customers in the field, here is the answer:

Ranger currently supports resource-based and tag-based policies for Hive (as well as HDFS files, HBase, etc.), in which you can mark a column as unauthorized for a specific user or user group. Any query by that user or group that touches the column fails altogether.

However, work is in progress to have queries that involve unauthorized columns simply mask (redact) the data instead of failing altogether. The JIRA is https://hortonworks.jira.com/browse/RMP-3705


3 REPLIES

Rising Star

@hduraiswamy Authorization and masking are two separate things. A user needs access to a column for the query to run. If the customer wants to filter out columns, the best approach is to create views. This is no different from other databases.

If the user has access to the column but the column data should be redacted, then masking is the appropriate solution.
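
To make the difference between filtering and masking concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Hive, since the view approach is the same idea in any SQL database. The table `customers`, its columns, and the view names `customers_filtered` and `customers_masked` are hypothetical, and the `'XXX-XX-'` masking rule is just one possible redaction. In a real deployment you would presumably grant users access to the view rather than to the base table.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Hive.
conn = sqlite3.connect(":memory:")

# Hypothetical base table containing a sensitive column (ssn).
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, ssn TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alice", "123-45-6789"), (2, "Bob", "987-65-4321")],
)

# Filtering: the view omits the sensitive column entirely.
conn.execute(
    "CREATE VIEW customers_filtered AS SELECT id, name FROM customers"
)

# Masking: the view keeps the column but redacts all except the last 4 characters.
conn.execute(
    "CREATE VIEW customers_masked AS "
    "SELECT id, name, 'XXX-XX-' || substr(ssn, -4) AS ssn FROM customers"
)

filtered_rows = conn.execute(
    "SELECT * FROM customers_filtered ORDER BY id"
).fetchall()
masked_rows = conn.execute(
    "SELECT * FROM customers_masked ORDER BY id"
).fetchall()

print(filtered_rows)
print(masked_rows)
```

Queries against either view keep running instead of failing: the first never exposes the column, and the second returns it in redacted form.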

New Contributor

DataSunrise offers enterprise-level, high-performance data masking for Hive: www.datasunrise.com/masking/hive/
