We have Ranger installed on our Hadoop cluster and had to set doas to false. As a result, all Hive jobs run by end users are shown as the hive user in the RM web UI and in the Ranger audit, so we are unable to audit who actually ran them.
We would like to set doas to true. Are there any impacts on the cluster or on any other services/components from setting doas to true?
Hi @Turing nix
I guess that right now you have only one Ranger HDFS policy, which authorizes the hive user to access the HDFS files in your data warehouse folder, and that you manage access for all other users through Hive policies?
If you set doas to true, access to Hive (the Hive tables and the underlying HDFS data) will be done with the end user's identity. If the answer to my previous question is yes, then you will need to create Ranger HDFS policies that authorize your users to access the HDFS data in your data warehouse folder, in addition to the existing Ranger Hive policies. This also means that column-based policies become useless, since a user can read the underlying files directly in HDFS.
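For reference, this is roughly what the change looks like in hive-site.xml. It is only a sketch and assumes that "doas" here means the HiveServer2 property hive.server2.enable.doAs; HiveServer2 typically needs a restart for it to take effect.

```xml
<!-- hive-site.xml: run queries as the connecting end user (assumed property for "doas") -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
```

With this set to true, the HDFS permission checks are made against the end user rather than the hive user, which is why the extra Ranger HDFS policies are needed.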
There is a Hive hook that can be used to keep doas=false and still present the user identity to YARN, but it's not supported in production. Take a look at this article: https://community.hortonworks.com/articles/24009/map-hive-jobs-to-yarn-queues.html