Created on 04-10-2018 11:32 AM - edited 09-16-2022 06:05 AM
Hi,
One of the prerequisites for enabling Sentry is disabling impersonation, which means that all queries are executed as the hive user rather than as the user who actually ran the command in Hue. However, this breaks our YARN queue placement, which previously used the submitting user's group name to select the queue and now assigns every job to the queue for hive's group. We can avoid this behaviour by putting the 'Specified' placement policy at the top of the rule list, which ensures that the job is assigned to a queue based on the user's group rather than hive's default group. This workaround opens the door to intentional misuse, though: a user can set the mapred.job.queue.name parameter in their Hue session and bypass the intended queue placement, potentially using more cluster resources than the administrator meant to give them.
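As a rough illustration of the placement setup described above, and assuming the Fair Scheduler is in use, the rule order might look like the following fair-scheduler.xml excerpt. The primaryGroup and default rules are assumptions made for the sake of the example and are not taken from the actual cluster configuration, which may well be managed through Cloudera Manager's resource pool settings rather than edited by hand.

```xml
<!-- Hypothetical fair-scheduler.xml excerpt; assumes the Fair Scheduler is in use. -->
<allocations>
  <queuePlacementPolicy>
    <!-- 'Specified' on top: honours the queue named on the job (for example via
         mapred.job.queue.name), which is also the value a user could override. -->
    <rule name="specified" create="false"/>
    <!-- Assumed fallback: queue named after the submitting user's primary group.
         With impersonation disabled, the submitting user is hive. -->
    <rule name="primaryGroup" create="false"/>
    <!-- Assumed last resort if no earlier rule matches. -->
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
```

Rules are evaluated top to bottom, so any queue named explicitly in the session wins over the group-based fallback, which is exactly where the misuse concern comes from.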
We tried using the hive.security.authorization.sqlstd.confwhitelist parameter to prevent users from setting mapred.job.queue.name, but it apparently requires enabling Standard Hive Authorization, whereas Cloudera doesn't support the native Hive authorization frameworks. In any case, we failed to configure Standard Hive Authorization and Sentry at the same time on our CDH 5.9.3. Moreover, the documentation at https://www.cloudera.com/documentation/enterprise/5-9-x/topics/sg_sentry_service_config.html makes clear that we cannot even use hive.security.command.whitelist to block the 'set' command altogether. Taking all of this together, we'd like to ask:
1. Is using the 'Specified' placement policy in the YARN queue configuration the only way to get queue mapping based on the group of the user logged in to Hue, rather than the hive user?
2. Do you know of other ways to block setting specific variables in a Hive session?
3. Is it possible to configure Standard Hive Authorization in CDH at all?
I'd be grateful for any pointers. Thanks!
Created 04-13-2018 08:38 AM