06-10-2017 06:34 AM - edited 06-10-2017 06:43 AM
What are the ideal / minimum required ACLs that need to be applied on an HDFS directory containing Hive external tables?
1. I have a directory '/user/devteam/custdata' with permissions 770.
hdfs dfs -getfacl /user/devteam/custdata
# file: /user/devteam/custdata
# owner: devteam
# group: devteam
2. I set ACL of...
hdfs dfs -setfacl -R -m group:hive:rwx,group:qateam:r-x /user/devteam/custdata
3. Sentry Roles
create role qateamrole;
grant select on database devdb to role qateamrole;
create role devteamrole;
grant all on database devdb to role devteamrole;
grant all on uri '/user/devteam/custdata' to role devteamrole;
With these ACLs and Sentry roles in place and HDFS Sentry sync enabled, will I be able to run all my sqoop jobs and hive queries successfully as the owner and as qateam? At the same time, I want the data to be visible only to the owner and to teams that have permission to query / cat / list it.
06-12-2017 04:45 AM - edited 06-12-2017 04:46 AM
From my understanding, when you use the Sentry HDFS synchronization plugin, you only need to set the following ownership and permissions on the directory:
hive:hive / 771
The plugin then manages the remaining permissions according to the privileges granted in Sentry.
If you set the permissions yourself, there is no point in using the Sentry HDFS synchronization plugin.
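For illustration, here is a rough sketch of how the hive:hive / 771 setup could be applied to the directory from the question. The `run` helper and the `RUN` variable are hypothetical and only there so the script prints the commands by default instead of executing them. Dropping the manually added ACL entries with `-setfacl -b` is my assumption of what "let the plugin manage it" means in practice, so check it against your own setup first. Changing ownership will most likely need to be done as the HDFS superuser.

```shell
#!/bin/sh
# Directory taken from the question.
TABLE_DIR="/user/devteam/custdata"

# Hypothetical dry-run wrapper: prints each command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Hand the directory over to hive:hive with mode 771.
run hdfs dfs -chown -R hive:hive "$TABLE_DIR"
run hdfs dfs -chmod -R 771 "$TABLE_DIR"

# Assumption: remove the ACL entries that were added by hand,
# leaving only the base owner/group/other entries.
run hdfs dfs -setfacl -R -b "$TABLE_DIR"

# Inspect the resulting ACLs on the directory.
run hdfs dfs -getfacl "$TABLE_DIR"
```

Run it as is to review the commands, then with `RUN=1` to apply them. Once the Sentry grants are in place, the last command is the one to repeat to see what the plugin reports for the hive and qateam groups.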