I have a question about best practices for a Cloudera cluster, specifically at the HDFS level when using Sentry RBAC. Our main concern is that data is copied from external sources into HDFS, and the permissions are a headache for us because Sentry cannot be applied at the URI level. Are there any solutions or documents to follow for such cases?
Hello @AmroSaleh,
Thank you for reaching out on Community and raising your enquiry on Sentry-HDFS.
Have you seen the "Authorization with Apache Sentry" documentation, please?
For HDFS-Sentry synchronization to work, you must use the Sentry service, not policy file authorization. See "Synchronizing HDFS ACLs and Sentry Permissions" for more details.
Let us know if you went through these docs and you still need any additional information.
Thanks Bender. Yes, I checked these documents, and yes, I configured the Sentry service. The issue with Sentry is that HDFS ACLs will not be applied. For example, if I have a user who needs to write to a specific path on HDFS, then because Hive manages everything I cannot add an ACL for this user, and a grant on the URI will be ALL.
Sentry-HDFS authorization is focused on Hive warehouse data - that is, any data that is part of a table in Hive or Impala.
For HDFS-only control, you should look at HDFS ACLs or Extended ACLs.
See this doc.
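As a rough illustration of the HDFS ACL approach, here is a minimal Python sketch that builds the `hdfs dfs -setfacl` and `hdfs dfs -getfacl` commands for a landing directory. The path `/data/landing`, the user `etl_user`, and the helper `setfacl_command` are hypothetical names chosen for this example. It assumes ACLs are enabled on the NameNode (`dfs.namenode.acls.enabled=true`) and that the path lies outside the Hive warehouse prefixes managed by Sentry-HDFS synchronization. The script only prints the commands so they can be reviewed before running them on a cluster.

```python
import shlex


def setfacl_command(path, user, perms="rwx", default=False):
    """Build an `hdfs dfs -setfacl -m` command as an argument list.

    path, user and perms are illustrative inputs, not values from the thread.
    With default=True the entry becomes a default ACL, which new files and
    subdirectories created under `path` inherit.
    """
    if len(perms) != 3 or not set(perms) <= set("rwx-"):
        raise ValueError("perms must look like 'rwx', 'r-x', 'rw-', ...")
    entry = f"user:{user}:{perms}"
    if default:
        entry = "default:" + entry
    return ["hdfs", "dfs", "-setfacl", "-m", entry, path]


# Hypothetical landing directory and user, used only for illustration.
commands = [
    setfacl_command("/data/landing", "etl_user"),                # access ACL
    setfacl_command("/data/landing", "etl_user", default=True),  # default ACL
    ["hdfs", "dfs", "-getfacl", "/data/landing"],                # verify result
]

for cmd in commands:
    # Print instead of executing, so nothing changes on the cluster.
    print(shlex.join(cmd))
```

The access ACL lets the user write into the directory itself, while the default ACL is what carries the permission over to files and subdirectories created there later, which matters when external jobs keep copying new data in.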