Ranger and hive.server2.enable.doAs configuration

After installing Ranger and enabling the Hive plugin for Ranger, one of the configuration changes it made was setting hive.server2.enable.doAs=false. All jobs now run as the "hive" user. Why is it recommended to change this to false?

When we try to drop a table, we get a permission error: although we are logged in as "dwuser", the request is treated as coming from the "hive" user.

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=hive, access=WRITE, inode="/data/insight/dwuser":dwuser:dwuser:drwxr-xr-x

Is there any impact if we change it back to true?

5 Replies

Re: Ranger and hive.server2.enable.doAs configuration

@Anandha L Ranganathan

When doAs is set to false, all queries are executed as the hive user. The hive user does not have write access to the location where the table is stored, /data/insight/dwuser: the directory is owned by dwuser with mode drwxr-xr-x, so only the owner can write to it. That is why you are seeing that message. You should be able to set doAs back to true without too many issues.
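To make the failure concrete, here is a minimal Python sketch that parses the denial message from the question and applies a simplified owner/group/other check to it. The helper names (`parse_denial`, `allowed`) and the group membership passed in for the hive user are assumptions made for illustration. The check deliberately ignores HDFS ACLs, the HDFS superuser, sticky bits, and any Ranger HDFS policies, so treat it as a model of the basic mode bits only.

```python
import re

# Pattern for the "Permission denied" part of an HDFS AccessControlException.
# Assumes a plain 10-character mode string such as drwxr-xr-x (no sticky bit).
DENIED = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwx-]{9})'
)


def parse_denial(message):
    """Extract user, access, path, owner, group and mode from the message."""
    match = DENIED.search(message)
    if match is None:
        raise ValueError("not an HDFS permission-denied message")
    return match.groupdict()


def allowed(user, user_groups, owner, group, mode, access):
    """Simplified check: use owner bits, else group bits, else other bits."""
    bits = mode[1:]  # drop the leading file-type character
    if user == owner:
        triplet = bits[0:3]
    elif group in user_groups:
        triplet = bits[3:6]
    else:
        triplet = bits[6:9]
    flag = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}[access]
    return flag in triplet


if __name__ == "__main__":
    message = (
        'Permission denied: user=hive, access=WRITE, '
        'inode="/data/insight/dwuser":dwuser:dwuser:drwxr-xr-x'
    )
    info = parse_denial(message)
    # The group set for "hive" is an assumption; adjust it for your cluster.
    print(allowed(info["user"], {"hive", "hadoop"}, info["owner"],
                  info["group"], info["mode"], info["access"]))
```

With doAs=false the request reaches HDFS as hive, which falls into the "other" bits (r-x) and is refused WRITE. The same check for dwuser uses the owner bits (rwx) and passes.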

Because of the way Hive interacts with HDFS, when doAs is set to true the user running the Hive query needs the right permissions in both HDFS and Hive (via Ranger). This is typically not a problem for tables stored in user home directories, as in your example. However, when the tables are managed by Hive in the /user/hive/warehouse directory, you need to remember to grant users HDFS access to the location of each specific table. You usually don't want to grant wide-open permissions on /user/hive/warehouse.

Re: Ranger and hive.server2.enable.doAs configuration

@Michael Young

Thanks for your reply. The property was set to true before the Ranger installation. During the installation it was changed to false, as recommended by Ambari. Why was it set to false? Do you have any insight on that? Setting it to false also leads to other problems, such as resource allocation in YARN.

I would be happy if someone could explain the impact of changing it to true.

Thanks in advance.

Re: Ranger and hive.server2.enable.doAs configuration

I'm not 100% positive why Ambari recommends setting it to false. As I indicated above, it is likely because of the extra Ranger policies that you need to create and manage for HDFS in addition to the Hive policies. These extra policies are not intuitive to users, and they can generate a lot of confusion about why some access works and other access does not.

Setting it to true gives you finer-grained access control and auditing. It also ensures better resource management via YARN queues. The points to consider are:

1. If you need column-level security in Hive, you still have to ensure the user has HDFS access to the data. The downside is that HDFS access carries no column-level restrictions, so the user can read data directly from HDFS that they may not be allowed to see through Hive. This is the only major impact of setting doAs to true.

2. If you are not concerned with column-level restrictions, there are no downsides to setting doAs to true that I'm aware of.

3. To get proper YARN queue mapping, you need to set doAs to true.

As an alternative, you can use a custom Hive hook to submit a username to get proper YARN Queue utilization. You can read more here: https://community.hortonworks.com/content/idea/9658/hive-support-capacity-scheduler-user-queue-mappi...
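The trade-offs discussed in this thread can be summarized with a small Python sketch. It is a simplified model of the behavior described in the replies, not Hive's actual code: the function name `effective_identities` and the dictionary keys are hypothetical, and "hive" as the service account is taken from the question.

```python
def effective_identities(end_user, do_as, service_user="hive"):
    """Return which identity each layer sees for one query.

    Simplified model of this thread's description:
    - Hive authorization through Ranger is evaluated for the end user.
    - With doAs=true, HDFS access and YARN submission use the end user.
    - With doAs=false, both use the HiveServer2 service account.
    """
    run_as = end_user if do_as else service_user
    return {
        "hive_authorization": end_user,
        "hdfs_access": run_as,
        "yarn_submission": run_as,
    }


if __name__ == "__main__":
    for do_as in (True, False):
        print(do_as, effective_identities("dwuser", do_as))
```

Read this way, doAs=false keeps the data files reachable only by the service account, at the cost of per-user YARN queue mapping. doAs=true restores per-user queues but requires each user to hold HDFS permissions on the table locations.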

Re: Ranger and hive.server2.enable.doAs configuration

@Anandha L Ranganathan

With Ranger enabled, we make sure hive.server2.enable.doAs is set to “false” so that permissions on the HDFS files related to Hive can be granted only to the “hive” user, and no one else is able to access those HDFS files directly.

After issuing a Hive query, if you check the Ranger audit logs you will see that the query runs as the original user (dwuser), while the related HDFS operations are executed as the “hive” user.

Re: Ranger and hive.server2.enable.doAs configuration

@Michael Young,

This post describes the different use cases and why you would want to set it to false or true: http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
