Created on 11-20-2019 03:45 AM - edited 09-16-2022 01:45 AM
Apache Ranger is a framework to enable, monitor and manage comprehensive data security across the Big Data platform. Ranger allows you to centralize security administration to manage all security related tasks in a central UI or using REST APIs. You can setup fine grained authorization to do a specific action and/or operation with the platform's components/tools through role based and/or attribute based access control. Ranger also supports central auditing of all events across the cluster on access/edit/update/delete operations.
Ranger has support for the following components through plugins:
This article will provide an example of setting up Ranger database policies as well as column masking rules. The example will also show what happens when you have two tables, with one table having column masking rules applied for a specific user, and trying to join the two tables.
Scenario:
In this screenshot, you can see that user_sales and user_hr have been given permission to access table_a and table_b:
However, in the following screenshot, user_sales is being prevented from seeing the col_masked values by a masking policy in table_a:
As user_hr, I can see both tables:
select * from table_a;
+---------------+---------------------+
| table_a.name | table_a.col_masked |
+---------------+---------------------+
| Mark | 1 |
| Jane | 2 |
| Steve | 3 |
+---------------+---------------------+
select * from table_b;
+-------------------+-----------------------+
| table_b.location | table_b.col_unmasked |
+-------------------+-----------------------+
| London | 1 |
| Amsterdam | 2 |
| Paris | 3 |
+-------------------+-----------------------+
user_hr can also join both tables together on the shared column without any issue:
select a.*, b.* from table_a a, table_b b where a.col_masked = b.col_unmasked;
+---------+---------------+-------------+-----------------+
| a.name | a.col_masked | b.location | b.col_unmasked |
+---------+---------------+-------------+-----------------+
| Mark | 1 | London | 1 |
| Jane | 2 | Amsterdam | 2 |
| Steve | 3 | Paris | 3 |
+---------+---------------+-------------+-----------------+
As user_sales, I'm allowed to see everything in table_b, but I'm not allowed to see the masked column in table_a:
select * from table_a;
+---------------+---------------------+
| table_a.name | table_a.col_masked |
+---------------+---------------------+
| Mark | NULL |
| Jane | NULL |
| Steve | NULL |
+---------------+---------------------+
select * from table_b;
+-------------------+-----------------------+
| table_b.location | table_b.col_unmasked |
+-------------------+-----------------------+
| London | 1 |
| Amsterdam | 2 |
| Paris | 3 |
+-------------------+-----------------------+
When I try and join the table, Hive will prevent the join from happening as the joined columns don't match up:
+---------+---------------+-------------+-----------------+
| a.name | a.col_masked | b.location | b.col_unmasked |
+---------+---------------+-------------+-----------------+
+---------+---------------+-------------+-----------------+
When running the join operation in Hive, Hive will apply the masking operation before performing the join, thus preventing the rows from matching each other and generating a result set.