Security best practices when using Ranger dictate that Hive jobs should ideally run as user 'hive', so that only Ranger Hive policies govern end-user access to data and 'hive' owns the entire directory and file structure for Hive on HDFS. This is achieved by setting hive.server2.enable.doAs to 'false'. It also improves performance, because it enables container pre-warming for Tez, which applies only to jobs started by 'hive' and not to jobs started by other end users.
The problem introduced by doAs = false is that, if YARN Capacity Scheduler queue mappings have been defined on a per-user or per-group basis, those mappings no longer apply: every job is started by the same user (i.e. 'hive'), which makes the queue definitions useless.
One solution is to use a Hive hook that detects the real user who started the query, so that the job can be submitted to the right queue even though it still runs as user 'hive'. The hook then finds the list of groups the user belongs to and tries to match them against a group-mappings file (with the structure groupname:queuename). When one of the user's groups matches, the job is automatically submitted to the corresponding queue.
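The matching step described above can be sketched in plain Java, independent of any Hive or Hadoop classes. This is only an illustration under stated assumptions, not the hook's actual code: the class and method names (GroupQueueResolver, parseMappings, resolveQueue) are hypothetical, the group and queue names are invented, skipping blank lines, '#' comments and malformed lines is an assumption about the file format, and so is taking the first of the user's groups that has a mapping.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: illustrates the group-to-queue lookup only.
class GroupQueueResolver {

    // Parses lines of the form "groupname:queuename" into a map.
    // Assumption: blank lines, lines starting with '#', and lines without
    // both a group and a queue are ignored.
    static Map<String, String> parseMappings(List<String> lines) {
        Map<String, String> mappings = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int sep = line.indexOf(':');
            if (sep <= 0 || sep == line.length() - 1) {
                continue; // malformed entry: missing group or queue name
            }
            String group = line.substring(0, sep).trim();
            String queue = line.substring(sep + 1).trim();
            mappings.put(group, queue);
        }
        return mappings;
    }

    // Returns the queue mapped to the first of the user's groups that has
    // an entry. Assumption: an empty result means "no mapping found" and
    // the caller keeps whatever default queue is configured.
    static Optional<String> resolveQueue(List<String> userGroups,
                                         Map<String, String> mappings) {
        for (String group : userGroups) {
            String queue = mappings.get(group);
            if (queue != null) {
                return Optional.of(queue);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Invented sample data standing in for the group-mappings file.
        Map<String, String> mappings = parseMappings(Arrays.asList(
                "analysts:adhoc",
                "etl:batch"));
        List<String> groups = Arrays.asList("users", "etl");
        System.out.println(resolveQueue(groups, mappings).orElse("(no match)"));
    }
}
```

In a real hook, the list of groups would come from the cluster's group mapping for the detected user, and the resolved queue name would be written into the job configuration before submission. Both steps depend on the Hive and Hadoop versions in use and are left out of this sketch.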
This Hive hook detects the user who started the Hive session, finds the groups that user belongs to, and sends the job to the corresponding queue according to the mappings defined in the group-mappings file.
It is based on this other hook, which submits the job to a queue named after the user's primary group: