Created on 06-10-2017 06:34 AM - edited 09-16-2022 04:44 AM
What are the ideal / minimum required ACLs that need to be applied on an HDFS directory containing Hive external tables?
1. I have a directory '/user/devteam/custdata' with permissions 770.
hdfs dfs -getfacl /user/devteam/custdata
# file: /user/devteam/custdata
# owner: devteam
# group: devteam
user::rwx
group::rwx
other::---
2. I set the following ACLs:
hdfs dfs -setfacl -R -m group:hive:rwx,group:qateam:r-x /user/devteam/custdata
3. Sentry Roles
create role qateamrole;
grant select on database devdb to role qateamrole;
create role devteamrole;
grant all on database devdb to role devteamrole;
grant all on uri '/user/devteam/custdata' to role devteamrole;
With these permissions set and Sentry HDFS synchronization enabled, will I be able to run all my Sqoop jobs and Hive queries successfully as both the owner and qateam? At the same time, I want the data to be visible only to the owner and to the teams that have permission to query, cat, or list it.
Thanks
Created on 06-12-2017 04:45 AM - edited 06-12-2017 04:46 AM
From my understanding, when you use the Sentry HDFS synchronization plugin you only need to set the following ownership and permissions:
hive:hive / 771
https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_hiveserver2_security.html#concept_vxf_pgx_nm
https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concept_z5b_42s_p4__section_lvc_4g4_rp
The plugin then manages the remaining permissions according to the privileges granted in Sentry.
If you set the permissions yourself, there is no point in using the Sentry HDFS synchronization plugin.
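As a rough sketch of what that would look like for the directory in the question (not a tested recipe): the helper below is hypothetical, prints each command by default, and only executes it when APPLY=1 is set. It assumes you run it as an HDFS superuser and that /user/devteam/custdata is covered by the Sentry synchronization path prefixes; handing the directory to hive:hive is my reading of the documentation linked above, so check it against your own setup first.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints every command, runs it only if APPLY=1.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Directory from the question; override DATA_DIR for another location.
DATA_DIR="${DATA_DIR:-/user/devteam/custdata}"

# Assumption: must be run as an HDFS superuser (chown needs it).
hand_over_to_hive() {
  run hdfs dfs -chown -R hive:hive "$DATA_DIR"   # owner and group = hive
  run hdfs dfs -chmod -R 771 "$DATA_DIR"         # rwxrwx--x
  run hdfs dfs -getfacl "$DATA_DIR"              # inspect the resulting ACLs
}

hand_over_to_hive
```

Run it once as-is to review the commands, then again with APPLY=1 to execute them. With synchronization working, the getfacl output on a table location should reflect the Sentry grants (for example, read access for the group mapped to qateamrole) without any manual setfacl.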