Created on 10-03-201605:14 PM - edited 08-17-201909:05 AM
Most user access control is accomplished on the group level. One of the example mentioned in Ranger release is to control each doctor only able to view the patients of his/her own. However most example provided up-to-date are mostly static filter condition that will not change for individual user. In this tutorial, we would demonstrate how to create a row level filter that applies to a group but takes different effect on the user level by leveraging UDF.
The use case is as following: a patient table contains patient information as well as the id of the doctor treating the patient (this is a simplified example, in really world the relation may be a many-to-many relation, but it could a handled by the same approach). We also have a doctor reference table where the doctor login credential could be found. We will use the login credential to identify the doctor and only show the patient that associated with the doctor.
To setup the environment, first download the HDP2.5 Sandbox form Hortonworks website, then follow the instruction to setup the sandbox.
Then login to Ambari through 127.0.0.1:8080 using admin/admin and setup the data using the following script in Hive View.
create table patients (ssn string, name string, doctor_id int) stored as orc;
create table doctorRef (id int, username string) stored as orc;
insert into doctorRef values (1, 'raj_ops'), (2, 'holger_gov');
insert into patients values
('111-11-1111', 'John Doo', 1),
('222-22-2222', 'Amy Long', 1),
('333-33-3333', 'Muni White', 2),
('444-44-4444', 'Kerry Chang', 2),
('555-55-5555', 'Holy Ng', 1);
Then we will setup the dynamic row level filter inside Ranger. Login to Ranger at 127.0.0.1:6080 using admin/admin, then go to Access Manger -> Resource Based Policies, under Hive, click Sandbox_Hive. Then click Row Level Filter
Add new policy like the following
Policy Type: row level filter
Policy Name: dynamic row level filter
Hive database: default
Hive table: patients
Audit logging: yes
Select Group: public
Access Type: select
Row level filter: doctor_id in (select id from doctorRef where username = current_user())
Then click save.
Now we could test the row level filter policy by login to Ambari 127.0.0.1:8080 as raj_ops/raj_ops, and only the patients with the doctor_id 1 is showing up.
When we use UDF current_user() that identify the user in current hive session, we could apply row level filter in group level that could have different effect on each user. This is particularly useful in the environment where large number of users are managed through group policy and users move between groups quite frequently. So no access control need to be done on user level, rather policy could be defined on group level and then through the dynamic row level filter, it will be applied to user level in a dynamic fashion.