Created on 05-05-2018 11:02 AM - edited 09-16-2022 06:11 AM
Please help me understand how impersonation matters in a non-secured (i.e. non-SSL and non-Kerberos) environment/cluster.
Created on 05-07-2018 06:15 PM - edited 08-18-2019 01:33 AM
@Ankita Shukla SSL and Kerberos address other aspects of security, namely wire encryption and authentication; impersonation solves a different problem. Impersonation means performing actions on behalf of the requesting user. Certain services, such as Knox, Livy, or Hive (when doAs=true), need to impersonate end users when accessing resources like YARN and HDFS. Only designated users are allowed to impersonate other users. Impersonation in Hadoop is set up using the hadoop.proxyuser.* properties in core-site.xml: only the users listed there may impersonate others, and only for end users in the configured groups and for requests coming from the configured hosts.
A common example of impersonation is Hive configured to run as the end user instead of the hive user (hive.server2.enable.doAs=true). The Knox gateway and Livy are other good examples, and there are more.
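To make the hosts/groups rule concrete, here is a minimal Python sketch of the kind of check the hadoop.proxyuser.* settings describe. It is a simplified model, not Hadoop's actual implementation: the host name, group names, users, and the `may_impersonate` helper are all hypothetical, and it ignores the optional `.users` setting.

```python
# Simplified model of a proxy-user check; NOT Hadoop's real code.
# Property names follow the hadoop.proxyuser.<superuser>.{hosts,groups}
# pattern from core-site.xml. All values below are hypothetical.
CORE_SITE = {
    "hadoop.proxyuser.hive.hosts": "hs2-node1.example.com",
    "hadoop.proxyuser.hive.groups": "analysts,etl",
}

# Hypothetical user-to-group lookup; a real cluster resolves groups
# through its configured group mapping.
USER_GROUPS = {
    "alice": {"analysts"},
    "bob": {"finance"},
}


def _allowed(configured, candidates):
    """True if the comma-separated setting is '*' or lists any candidate."""
    if configured is None:
        return False  # no proxy-user entry means no impersonation
    values = {v.strip() for v in configured.split(",") if v.strip()}
    return "*" in values or bool(values & set(candidates))


def may_impersonate(superuser, end_user, request_host, conf=CORE_SITE):
    """Check whether `superuser` may act as `end_user` from `request_host`."""
    prefix = f"hadoop.proxyuser.{superuser}"
    host_ok = _allowed(conf.get(prefix + ".hosts"), {request_host})
    group_ok = _allowed(
        conf.get(prefix + ".groups"), USER_GROUPS.get(end_user, set())
    )
    return host_ok and group_ok


if __name__ == "__main__":
    print(may_impersonate("hive", "alice", "hs2-node1.example.com"))
    print(may_impersonate("hive", "bob", "hs2-node1.example.com"))
    print(may_impersonate("knox", "alice", "hs2-node1.example.com"))
```

In this model, hive may act as alice (her group is listed and the request comes from the listed host), but not as bob, and knox may not impersonate anyone because it has no proxy-user entries.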
Important aspects when using impersonation are:
1) All access to underlying resources (like HDFS) is made as the end user instead of the hive user. This helps when you want all authorization checks performed at the HDFS POSIX permission level.
2) Applications launched on YARN (if any) are launched as the end user instead of the hive/knox/livy user. This way you can use the Capacity Scheduler to map users to queues with different resource limits.
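As a rough illustration of the second point, the sketch below models how a Capacity Scheduler user/group-to-queue mapping could resolve a queue. The mapping string uses the `u:user:queue` / `g:group:queue` form of the yarn.scheduler.capacity.queue-mappings property; the queue names, users, groups, and the `resolve_queue` helper are hypothetical, and the fallback to a `default` queue is an assumption of this sketch.

```python
# Simplified model of Capacity Scheduler queue mapping; NOT YARN's real code.
# Hypothetical value for yarn.scheduler.capacity.queue-mappings.
QUEUE_MAPPINGS = "u:alice:adhoc,g:etl:batch"


def resolve_queue(user, groups, mappings=QUEUE_MAPPINGS, default="default"):
    """Return the queue of the first mapping that matches the user or a group."""
    for entry in mappings.split(","):
        kind, name, queue = entry.strip().split(":")
        if kind == "u" and name == user:
            return queue
        if kind == "g" and name in groups:
            return queue
    return default  # assumption: unmatched submissions go to a default queue


if __name__ == "__main__":
    # With doAs=true the application is submitted as the end user.
    print(resolve_queue("alice", {"analysts"}))
    # Without impersonation every query would be submitted as hive.
    print(resolve_queue("hive", {"hadoop"}))
```

The design point is that the mapping is keyed on the submitting user: with impersonation that is the end user, so per-user queues apply; without it, every job arrives as the service user and lands in the same queue.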
If this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
HTH