Created 09-26-2016 06:06 PM
Could the row level filter set up on a group be dynamic for each member of the group?
The use case is as follows:
Data analysts are all in a group called DA. Suppose an RLF policy is set up for the DA group, but the goal is to limit each member of the group to only the data assigned to him/her. This could be done easily with per-user policies, but if the group has many members, that could be a lot of policies to manage. Is there a way to set it at the group level, with something like
AssigneeID = @userID
where @userID is associated with each member.
Created 10-03-2016 03:09 PM
After some help from the Ranger PM, I was told this is possible by using a Hive UDF that returns the user running the current Hive context. The function is current_user().
The solution would be to include the user information in the data, or to have a lookup table that associates the user ID with the user name.
Then the row level filter would look something like
UserName = current_user()
or
UserID in (select uid from userLookup where uname = current_user())
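To see what the lookup-table variant does, here is a minimal Python sketch that simulates it with SQLite. This is not Ranger or Hive: the tasks table, its rows, the users alice/bob/carol, and the rows_visible_to helper are all made up for illustration, and current_user() is registered by hand as a stand-in for the Hive UDF. Ranger would apply the filter for you; here it is simply appended as a WHERE clause.

```python
import sqlite3

# The filter expression from the post, as it might be entered in the policy.
ROW_FILTER = "AssigneeID IN (SELECT uid FROM userLookup WHERE uname = current_user())"


def rows_visible_to(user):
    """Return the task names the given user would see (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for Hive's current_user(); SQLite has no such built-in.
    conn.create_function("current_user", 0, lambda: user)
    conn.executescript(
        """
        CREATE TABLE userLookup (uid INTEGER, uname TEXT);
        CREATE TABLE tasks (TaskName TEXT, AssigneeID INTEGER);
        INSERT INTO userLookup VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO tasks VALUES ('report', 1), ('audit', 2), ('cleanup', 1);
        """
    )
    # Simulate the row filter by appending it to the query by hand.
    query = "SELECT TaskName FROM tasks WHERE " + ROW_FILTER + " ORDER BY TaskName"
    rows = [r[0] for r in conn.execute(query)]
    conn.close()
    return rows


print(rows_visible_to("alice"))  # tasks assigned to uid 1
print(rows_visible_to("bob"))    # tasks assigned to uid 2
print(rows_visible_to("carol"))  # not in userLookup, so no rows
```

One policy on the group covers every member, and a user missing from the lookup table sees nothing, which is probably the safer default.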
I also recently published an HCC article detailing how to do it.
Created 09-26-2016 06:15 PM
Filter conditions are static in Ranger 0.6, so there is no way to populate a variable like @userID dynamically. You would need to define groups as appropriate for each user's compliant access.
Filter conditions can reference other objects, but this doesn't help you much in this case, as you would need separate "filter tables" for each user which results in the same administrative overhead you are seeking to avoid.
By other objects, I mean other tables not tied to this specific policy. For instance, if I have a normalized data model with a customer table and an address table, with an associated foreign key relationship, I can create a filter condition like customerID in (select customerID from customerAddress where state = 'TX').
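As a rough illustration of that kind of cross-table condition, here is a small Python sketch using SQLite rather than Hive and Ranger. The customer names, the rows, and the visible_customers helper are assumptions made up for the example; only the filter expression comes from the text above, and it is appended by hand where Ranger would apply it for you.

```python
import sqlite3

# Filter condition from the post; it references a table other than the one being filtered.
ROW_FILTER = "customerID IN (SELECT customerID FROM customerAddress WHERE state = 'TX')"


def visible_customers():
    """Return customer names that pass the filter (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (customerID INTEGER, name TEXT);
        CREATE TABLE customerAddress (customerID INTEGER, state TEXT);
        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Birch'), (3, 'Cobalt');
        INSERT INTO customerAddress VALUES (1, 'TX'), (2, 'CA'), (3, 'TX');
        """
    )
    # Simulate the row filter by appending it as a WHERE clause.
    query = "SELECT name FROM customer WHERE " + ROW_FILTER + " ORDER BY name"
    names = [r[0] for r in conn.execute(query)]
    conn.close()
    return names


print(visible_customers())  # only customers with a TX address
```

The condition is the same for every user in the policy, which is why it does not by itself solve the per-user case.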
Created 09-26-2016 06:33 PM
@slachterman, since you mentioned "Filter conditions can reference other objects", could you please give a couple of examples? I can think of using time, like
LastUpdateTime >= DATEADD(DAY, -30, GetDate())
What other kinds of object could be used?
Created 09-26-2016 06:44 PM
@Qi Wang by objects I am thinking of tables, see my updated answer. I haven't tested using a function in that way but as long as all the functions are deterministic (like GetDate()), that may work.
Created 10-10-2016 11:17 PM
In addition to current_user(), there are other built-in UDFs that you can use in RLF conditions, such as logged_in_user() and current_database(). Please refer to the Hive Misc. Functions documentation on the Apache wiki: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Misc.Functions