
Use of "hadoop.proxy" properties for hadoop components.


Could anyone kindly explain the "hadoop.proxy" properties below, which are set in core-site.xml for all the Hadoop components in the cluster? Why are these properties set, and what happens when they are removed?
==================================

## grep -C3 hadoop.proxy core-site.xml

<property>

<name>hadoop.proxyuser.falcon.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.falcon.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hbase.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hbase.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hcat.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hcat.hosts</name>

<value>host01</value>

</property>

<property>

<name>hadoop.proxyuser.hdfs.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hdfs.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hive.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hive.hosts</name>

<value>host01</value>

</property>

<property>

<name>hadoop.proxyuser.HTTP.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.HTTP.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hue.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.hue.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.oozie.groups</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.oozie.hosts</name>

<value>hosts01</value>

</property>

==================================

1 ACCEPTED SOLUTION


Re: Use of "hadoop.proxy" properties for hadoop components.

Proxy user - Superusers Acting On Behalf Of Other Users

A superuser with the username ‘super’ wants to submit a job and access HDFS on behalf of a user joe. The superuser has Kerberos credentials, but user joe doesn’t have any. The tasks are required to run as user joe, and any file accesses on the NameNode must be done as user joe. User joe must be able to connect to the NameNode or JobTracker over a connection authenticated with super’s Kerberos credentials. In other words, super is impersonating the user joe.

Some products such as Apache Oozie need this.

Configurations

You can configure proxy user using properties hadoop.proxyuser.$superuser.hosts along with either or both of hadoop.proxyuser.$superuser.groups and hadoop.proxyuser.$superuser.users.

By specifying the following in core-site.xml, the superuser named super can connect only from host1 and host2 to impersonate a user belonging to group1 or group2.

   <property>
     <name>hadoop.proxyuser.super.hosts</name>
     <value>host1,host2</value>
   </property>
   <property>
     <name>hadoop.proxyuser.super.groups</name>
     <value>group1,group2</value>
   </property> 
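The hadoop.proxyuser.$superuser.users property mentioned above works the same way but lists individual users instead of groups. A minimal sketch, assuming two hypothetical user names, user1 and user2, that super should be allowed to impersonate from the same two hosts:

```xml
<!-- Sketch only: user1 and user2 are placeholder names, not taken from the cluster above -->
<property>
  <name>hadoop.proxyuser.super.hosts</name>
  <value>host1,host2</value>
</property>
<property>
  <!-- Comma-separated list of users that super may impersonate -->
  <name>hadoop.proxyuser.super.users</name>
  <value>user1,user2</value>
</property>
```

The users and groups properties can also be set together for the same superuser.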

If these configurations are not present, impersonation will not be allowed and connection will fail.

If more lax security is preferred, the wildcard value * may be used to allow impersonation from any host or of any user. For example, by specifying as below in core-site.xml, user named oozie accessing from any host can impersonate any user belonging to any group.

   <property>
     <name>hadoop.proxyuser.oozie.hosts</name>
     <value>*</value>
   </property>
   <property>
     <name>hadoop.proxyuser.oozie.groups</name>
     <value>*</value>
   </property>
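Where the wildcard is broader than needed, the same oozie entries can be narrowed. The following is a sketch under stated assumptions: the subnet 10.222.0.0/16, the address 10.113.221.221, and the group name etl_users are all hypothetical placeholders. According to the Apache documentation linked in this answer, the hosts property accepts IP addresses, IP address ranges in CIDR format, and host names.

```xml
<!-- Sketch only: the addresses and the group name are placeholders for your own values -->
<property>
  <!-- Allow oozie to impersonate only when connecting from this subnet or this single host -->
  <name>hadoop.proxyuser.oozie.hosts</name>
  <value>10.222.0.0/16,10.113.221.221</value>
</property>
<property>
  <!-- Allow oozie to impersonate only members of this group -->
  <name>hadoop.proxyuser.oozie.groups</name>
  <value>etl_users</value>
</property>
```

With values like these, a request from oozie arriving from any other host, or on behalf of a user outside the listed group, should be rejected in the same way as when the properties are missing.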

More details are in the Apache documentation below:

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Superusers.html


