In my network I have four machines and one Domain Controller. Let's say the four machines are mc1, mc2, mc3, and mc4, and the domain controller is dc1.
I have a single-node Hadoop cluster deployed on mc1 using Ambari. I have also enabled Kerberos on it, with the KDC set to dc1, which runs Active Directory services.
Now I want to access HDFS on mc1 from mc2. Can you help me with what setup I need to do on mc2 to achieve that?
There are two ways to access services on your mc1 cluster from mc2. The first is to create principals for the users who will need access to the cluster. Even this may not be necessary if you use service accounts and then allow proxy users.
Once you decide on a strategy, either creating new principals and granting them access, or using service principals with proxy users, create keytabs for those principals and use the keytabs to log in with kinit. If you are writing a Java program that runs from mc2, use the UserGroupInformation class in your Java program to authenticate with the keytab.
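As a rough sketch of the proxy-user approach: the hadoop.proxyuser.* property names below are real Hadoop settings, but the service account name svc-hadoop, the host, and the group are hypothetical placeholders for whatever you actually use. You would add something like this to core-site.xml on the cluster and restart HDFS:

```xml
<!-- Allow the (hypothetical) service account "svc-hadoop" to impersonate
     other users, but only for requests coming from mc2 and only for users
     in the (hypothetical) group "hadoop-users". -->
<property>
  <name>hadoop.proxyuser.svc-hadoop.hosts</name>
  <value>mc2</value>
</property>
<property>
  <name>hadoop.proxyuser.svc-hadoop.groups</name>
  <value>hadoop-users</value>
</property>
```

The service account itself still authenticates to Kerberos with its own keytab; the proxy-user settings only let it act on behalf of end users without each of them needing a keytab on mc2.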
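As a minimal sketch of the keytab-plus-kinit flow, assuming a principal hdfsuser@EXAMPLE.COM already exists in Active Directory (both the principal name and the realm are placeholders for whatever you created on dc1):

```shell
# Create a keytab on mc2 with ktutil (interactive session shown).
# The encryption type must match what your AD domain actually issues.
ktutil
  addent -password -p hdfsuser@EXAMPLE.COM -k 1 -e aes256-cts-hmac-sha1-96
  wkt /etc/security/keytabs/hdfsuser.keytab
  quit

# Obtain a Kerberos ticket non-interactively using the keytab...
kinit -kt /etc/security/keytabs/hdfsuser.keytab hdfsuser@EXAMPLE.COM

# ...verify the ticket, then access HDFS on the cluster.
klist
hdfs dfs -ls hdfs://mc1:8020/
```

The hdfs command requires the Hadoop client binaries and configuration (core-site.xml, hdfs-site.xml from the cluster) to be present on mc2 as well; the port 8020 is the common default NameNode RPC port, so substitute yours if it differs.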
While enabling Kerberos on my cluster using Ambari, I supplied as the kadmin principal a user which I created in Active Directory on dc1 with delegated control to create users. Can I use that same Active Directory user principal to create a keytab using ktutil and then authenticate with kinit using this keytab? I am not sure (or clear) about your second approach of using service principals with proxy users.
One very basic question:
To try either approach above, the first thing I must do is install and run a Kerberos client on mc2 to access HDFS on mc1. Right?
Also, the Kerberos client on mc2 needs to be integrated with Active Directory on dc1. Right?
The answer to both your questions is yes: you need to install a Kerberos client on mc2 and integrate it with Active Directory. Before attempting to create users and connect to the cluster as Kerberos-authenticated users, you should first understand how Kerberos works and how it integrates with Hadoop. The following link does a really good job of explaining how to set up Kerberos and integrate it with Active Directory.
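As a rough sketch of those two steps on mc2, assuming a RHEL/CentOS host and an AD realm of EXAMPLE.COM (both placeholders; substitute your package manager and actual domain):

```shell
# Install the Kerberos client tools (on Debian/Ubuntu the package
# is krb5-user instead of krb5-workstation).
yum install -y krb5-workstation

# Point the client at the AD KDC on dc1 via /etc/krb5.conf.
# EXAMPLE.COM / example.com are placeholders for your actual AD domain.
cat > /etc/krb5.conf <<'EOF'
[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    kdc = dc1
    admin_server = dc1
  }

[domain_realm]
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM
EOF

# Sanity check: authenticate as an AD user and list the ticket.
kinit aduser@EXAMPLE.COM
klist
```

If kinit succeeds here, the client-to-AD integration is working, and you can move on to keytabs and Hadoop client configuration.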