
What value should "hive.server2.enable.doAs" be set to for impersonation in Hive?


I am confused about the proxy/impersonation concept in Hive. Could you please clarify it, if possible with an example? I have read several documents on it, and they confused me even more.

As far as I know, the property below has the following meaning. To allow impersonation/proxying in Hive for the example below, what needs to be set?

hive.server2.enable.doAs = true --> Run Hive scripts as the end user instead of the hive user.

hive.server2.enable.doAs = false --> All jobs run as the hive user.

Example, below is my setup:

===========================

1. Create two groups.

groupadd -g 1100 G1

groupadd -g 1101 G2

2. Create three users userA, userB and userAdmin.

useradd -m -g 1100 -u 1103 -G G1 -s /bin/bash userA

useradd -m -g 1101 -u 1104 -G G2 -s /bin/bash userB

useradd -m -g 1101 -u 1106 -G G1 -s /bin/bash userAdmin

3. Set password for all three users.

4. Create required directories in hdfs

hadoop fs -mkdir /user/userA

hadoop fs -mkdir /user/userB

hadoop fs -mkdir /user/userAdmin

hadoop fs -chown -R userA:G1 /user/userA

hadoop fs -chmod -R 750 /user/userA

hadoop fs -chown -R userB:G2 /user/userB

hadoop fs -chmod -R 750 /user/userB

hadoop fs -chown -R userAdmin:G1 /user/userAdmin

5. Create required tables.

su - userA

kinit userA

hive

drop table if exists mkf_Vision_Notes_a2;

create table mkf_Vision_Notes_a2 (cust_id string) row format delimited fields terminated by '|' stored as TEXTFILE location '/user/userA/mkf_vision_notes_a2';

quit;

logout

su - userB

kinit userB

hive

drop table if exists mkf_Vision_Notes_b2;

create table mkf_Vision_Notes_b2 (cust_id string) row format delimited fields terminated by '|' stored as TEXTFILE location '/user/userB/mkf_vision_notes_b2';

6. Set the proxy users with the following properties in core-site.xml, then restart the Hadoop services.

<property>

<name>hadoop.proxyuser.userA.groups</name>

<value>G2</value>

</property>

<property>

<name>hadoop.proxyuser.userA.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.userB.groups</name>

<value>G1</value>

</property>

<property>

<name>hadoop.proxyuser.userB.hosts</name>

<value>*</value>

</property>

=============================

Given the above setup, does that mean userA can now access userB's tables? Since userA is allowed to impersonate G2 (userB is a member of G2), when userA tries to access userB's table 'mkf_Vision_Notes_b2', will it use userB's privileges to access the table?

And does the same apply to userB? Can userB access userA's tables because it is impersonating G1, and will a query that userB runs against them execute with userA's privileges?

Is this the concept of impersonation? Please correct me if I am wrong, and if so, which properties should be set to achieve this?

And for impersonation, what should the value of 'hive.server2.enable.doAs' be?


Re: What value should "hive.server2.enable.doAs" be set to for impersonation in Hive?

@rakesh kumar

Everything in your setup was fine up to the following assumption:

From the above setup, so does that mean userA can now access the tables of userB. As userA is impersonating G2(userB is a member of G2), 

And does it implies same for UserB too, will UserB can access UserA tables as it is impersonating G1 and when UserA executing it should be running with userA privileges.

So let's first take the following snippet from your core-site.xml:

<property>
<name>hadoop.proxyuser.userA.groups</name>
<value>G2</value>
</property>
<property>
<name>hadoop.proxyuser.userA.hosts</name>
<value>*</value>
</property>

This says that userA can connect from any host (*) and impersonate users belonging to G2. So if I am userB, userA can impersonate me. What does that mean in practice? It means userA is the one whose keytab will be used. Based on the snippet above, the following would be possible:

mqureshi$ su userB
userB$ kinit userA
userB$ <run my hive query>

The same holds the other way around for userB impersonating members of G1. So in the end, given your two settings, userA will be able to impersonate members of G2 and userB will be able to impersonate members of G1.
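To make the rule concrete, here is a toy model in shell of how the two proxyuser entries read. It is only a sketch, not how Hadoop implements the check: the function names proxy_groups, user_groups and can_impersonate are made up for illustration, the group memberships are hard-coded from the useradd commands in the question, and the hosts check is ignored because both entries use *.

```shell
# proxy_groups SUPERUSER: groups whose members SUPERUSER may impersonate,
# taken from hadoop.proxyuser.<user>.groups in the question's core-site.xml.
proxy_groups() {
  case "$1" in
    userA) echo "G2" ;;
    userB) echo "G1" ;;
    *)     echo "" ;;
  esac
}

# user_groups USER: group memberships from the useradd commands in the question.
user_groups() {
  case "$1" in
    userA)     echo "G1" ;;
    userB)     echo "G2" ;;
    userAdmin) echo "G2 G1" ;;
    *)         echo "" ;;
  esac
}

# can_impersonate SUPERUSER TARGET: succeeds if TARGET belongs to a group
# that SUPERUSER is allowed to impersonate.
can_impersonate() {
  allowed=$(proxy_groups "$1")
  for g in $(user_groups "$2"); do
    for a in $allowed; do
      if [ "$g" = "$a" ]; then
        return 0
      fi
    done
  done
  return 1
}

# report SUPERUSER TARGET: print the decision for one pair.
report() {
  if can_impersonate "$1" "$2"; then
    echo "$1 -> $2: allowed"
  else
    echo "$1 -> $2: denied"
  fi
}

report userA userB      # userA -> userB: allowed
report userA userA      # userA -> userA: denied
report userB userA      # userB -> userA: allowed
report userB userAdmin  # userB -> userAdmin: allowed
```

Note that userAdmin is in both G1 and G2, so under these two entries either userA or userB could impersonate it.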

Now the fun part. Everything above that you do in core-site.xml is done for Hadoop. For Hive, you don't need all that. You do need hive.server2.enable.doAs, which is set to true by default.
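As a rough illustration of what hive.server2.enable.doAs changes, here is a toy model in shell. It is not Hive code: the function name run_as is made up, and it assumes the HiveServer2 service account is called hive.

```shell
# run_as DOAS CONNECTED_USER: prints the user a query would run as.
run_as() {
  case "$1" in
    true|True|TRUE) echo "$2" ;;    # doAs=true: run as the connected end user
    *)              echo "hive" ;;  # doAs=false: run as the service user (assumed "hive")
  esac
}

run_as true userA    # prints: userA
run_as false userA   # prints: hive
```

With the flag set to true, HDFS permissions such as the 750 on /user/userA are evaluated against the end user; with false they are evaluated against the hive user.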

What you need for hive impersonation is something very simple.

"jdbc:hive2://<hiveserver2 host>:10000/;principal=hive/HiveServer2Host@YOUR-REALM.COM;hive.server2.proxy.user=<User belonging in G2>"

And your principal "hive" is already configured in hive-site.xml.

All the other settings are for core Hadoop impersonation, such as MapReduce.
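To try the connection string from the command line, here is a minimal sketch that assembles the same URL for beeline. The helper name build_hs2_url and the host, realm and user values are placeholders made up for illustration; port 10000 and the hive service principal are taken from the URL above.

```shell
# build_hs2_url HOST REALM PROXY_USER
# Hypothetical helper: prints a HiveServer2 JDBC URL that asks the server
# to run the session as PROXY_USER via hive.server2.proxy.user.
build_hs2_url() {
  host=$1
  realm=$2
  proxy_user=$3
  printf 'jdbc:hive2://%s:10000/;principal=hive/%s@%s;hive.server2.proxy.user=%s\n' \
    "$host" "$host" "$realm" "$proxy_user"
}

# Example values (assumptions, replace with your own host, realm and user).
url=$(build_hs2_url hs2.example.com EXAMPLE.COM userB)
echo "$url"

# On a Kerberized cluster you would first obtain a ticket for the
# connecting user and then hand the URL to beeline, for example:
#   kinit userA
#   beeline -u "$url"
```

Keep the URL quoted when passing it to beeline, because it contains semicolons that the shell would otherwise treat as command separators. Whether the server accepts the proxy request still depends on how the cluster is configured.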
