Created 08-03-2016 02:56 PM
Hi,
I am working on a distcp solution between two clusters. On cluster01's HDFS there are multiple directories, each owned by a different application team. The requirement is to distcp these directories to cluster02 while preserving their access privileges. Both clusters are secured.
I was thinking of having a service user, something like "distcp-user", with its own Kerberos principal to manage the distcp process. That would make auditing easy as well.
Thanks
Vijay
Created 08-03-2016 04:24 PM
@Vijaya Narayana Reddy Bhoomi Reddy
You can leverage Kerberos impersonation and maintain the read/write policy for the user you plan on impersonating through Ranger. Set up a Ranger policy on cluster one that allows the user to read, and a Ranger policy on cluster two that allows the user to write. Have you looked into Apache Falcon? It might make the replication easier to set up.
Confirm that hadoop.security.authorization is set to true.
To enable Kerberos impersonation, add the following to core-site.xml:
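For reference, a minimal sketch of how that setting would look in core-site.xml, assuming you manage the file by hand rather than through a management UI such as Ambari:

```xml
<!-- Sketch only: the authorization check mentioned above, set in core-site.xml -->
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
```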
<property>
<name>hadoop.proxyuser.yourapp.groups</name>
<value>ImpersonationGrp1,ImpersonationGrp2</value>
</property>
<property>
<name>hadoop.proxyuser.yourapp.hosts</name>
<value>host</value>
</property>
Replace yourapp with the user name of your service principal. Replace ImpersonationGrp1 and ImpersonationGrp2 with the groups whose users your service user is allowed to impersonate. Finally, replace host with the hostname of your app server.
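As an illustration, here is roughly what the filled-in properties might look like for the "distcp-user" service user from the question. The group names and the hostname below are hypothetical placeholders, not values from this thread; substitute your own.

```xml
<!-- Sketch only: "app-team-a", "app-team-b" and "edge01.example.com" are hypothetical -->
<property>
<name>hadoop.proxyuser.distcp-user.groups</name>
<!-- groups whose members distcp-user may impersonate -->
<value>app-team-a,app-team-b</value>
</property>
<property>
<name>hadoop.proxyuser.distcp-user.hosts</name>
<!-- host(s) from which distcp-user is allowed to submit impersonated requests -->
<value>edge01.example.com</value>
</property>
```

Keeping the groups and hosts lists narrow, rather than using a wildcard, limits what the service user can do if its keytab is ever compromised.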
Created 08-03-2016 04:35 PM
Thanks @Sunile Manjee. This is the approach I think I need to follow; I was just trying to understand whether there is any alternative. To answer your question about Falcon: we are not using it because we are on HDP 2.4.2 and need to leverage HDFS snapshots, which Falcon doesn't support until 2.5. So I'm going with this approach for now.