Integrating KERBEROS with AD and LDAP

Rising Star

Team,

Need your help understanding AD / LDAP / Kerberos integration on Hadoop. Please help me understand:

1) What is the use of having LDAP between AD, Hadoop, and Kerberos in the integration?

2) What are the advantages and disadvantages of integrating AD, Hadoop, and Kerberos without LDAP?

3) What is the difference between implementing an MIT KDC and a direct AD setup?

Can you please point me to documentation that explains how to integrate a Hadoop cluster with Active Directory and Kerberos?

1 ACCEPTED SOLUTION

First let's clarify the difference between LDAP and AD.

LDAP is an application protocol for querying and modifying items in directory service providers (e.g., Active Directory). AD is a directory service provider that supports the LDAP protocol, among others.

https://jumpcloud.com/blog/difference-between-ldap-and-active-directory/

1) What is the use of having LDAP between AD, Hadoop, and Kerberos in the integration?

You wouldn't actually have a separate LDAP provider; you would just use the LDAP protocol to talk to AD.

2) What are the advantages and disadvantages of integrating AD, Hadoop, and Kerberos without LDAP?

See the answer above. You only use the LDAP protocol to connect to AD, not a separate LDAP directory service provider.
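To make "using the LDAP protocol to talk to AD" a bit more concrete, here is a minimal sketch of the kind of search filter an LDAP client might send to AD to look up a user or a group. The helper names and the login `tom` are made up for illustration; `sAMAccountName` and `objectClass` are standard AD attributes, and the escaping follows RFC 4515. Treat it as a rough example, not as what Hadoop or Ambari does internally.

```python
# Sketch: building LDAP search filters for Active Directory lookups.
# Function names and sample values are hypothetical, for illustration only.

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape special characters so the value is safe inside a filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def ad_user_filter(login: str) -> str:
    """Filter matching an AD user object by its logon name."""
    return f"(&(objectClass=user)(sAMAccountName={escape_filter_value(login)}))"


def ad_group_filter(group_name: str) -> str:
    """Filter matching an AD group object by its common name."""
    return f"(&(objectClass=group)(cn={escape_filter_value(group_name)}))"


if __name__ == "__main__":
    print(ad_user_filter("tom"))
    print(ad_group_filter("hadoop-users"))
```

A client would send a filter like this, together with a search base such as your domain's DN, over an LDAP connection to a domain controller. No separate LDAP server sits in between.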

3) What is the difference between implementing an MIT KDC and a direct AD setup?

You can go with either.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Security_Guide/content/_installing_and_c...

A very general rule of thumb I follow is to use the AD KDC if the cluster has fewer than 100 nodes. If the cluster has more than 100 nodes, a local LDAP/KDC might be a better option. This is because the load on AD from hundreds of service accounts can cause performance and stability issues in AD. It's not so much the KDC itself; the challenge is the combination of AD lookups/searches and the KDC both running on AD.
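To get a feel for how quickly the number of service accounts grows with cluster size, here is a rough sketch that enumerates Kerberos service principals in the usual `service/host@REALM` form. The hostnames, the realm, and the assumption of three Kerberized services per worker are all made up for illustration; a real cluster will have a different mix.

```python
# Sketch: estimating how many service principals a cluster needs.
# Hostnames, realm, and the per-node service list are hypothetical.
from itertools import product


def service_principals(hosts, services, realm):
    """Return one principal per (host, service) pair: service/host@REALM."""
    return [f"{svc}/{host}@{realm}" for host, svc in product(hosts, services)]


# Assumed example: 120 worker nodes, each running three Kerberized services.
hosts = [f"worker{i:03d}.example.com" for i in range(1, 121)]
services = ["dn", "nm", "HTTP"]

principals = service_principals(hosts, services, "EXAMPLE.COM")

if __name__ == "__main__":
    print(len(principals), "principals, e.g.", principals[0])
```

Under these assumptions, 120 nodes already need 360 service accounts before counting master services or end users, which is the kind of load on AD the rule of thumb is about.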

Can you please point me to documentation that explains how to integrate a Hadoop cluster with Active Directory and Kerberos?

Take a look at these links for instructions on how to enable Kerberos on HDP and integrate with AD:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Security_Guide/content/_configuring_amba...

http://hortonworks.com/blog/enabling-kerberos-hdp-active-directory-integration/

6 REPLIES

Master Mentor

@suresh krish

When your Hadoop cluster is accessed by thousands of users, it's best to use SSO, hence AD/LDAP: it makes managing user credentials easier and lets you apply corporate security settings.

In a cluster without Kerberos, logging on to a node basically gives you access to all the resources. Say you logged on as TOM: even if someone had stolen your credentials, the cluster would believe they are indeed TOM, and so would YARN and the other components. In modern IT infrastructure, with all the hacking, DoS attacks, etc., that is very dangerous. In a Kerberized environment, Hadoop won't simply believe you are TOM. It will ask you for a ticket, analogous to a passport at an airport, and just as immigration checks that a passport is not forged, it will check your ticket (passport) against its database to ascertain that it was not stolen. ONLY after validating that you are really TOM will it allow you to run queries or jobs on that cluster.

That's quite reassuring, isn't it? For documentation, there should be some in this forum. If not, I will need to mask some data before I can provide you with my production integration documentation.

Happy Hadooping

Rising Star

Hi Eyad Garelnabi,

Thanks for your time. I have one quick question. As you said, if there are more than 100 nodes then a local LDAP/KDC will be better. In that case, will users be created on the local machine? That is, will users be created on the Linux machines and handled by LDAP? Please correct me if I am wrong.

avatar

Yes. Rather than recreating users from scratch, though, you can synchronize your local LDAP with your corporate AD.

Having said that, especially when it comes to security, you'll be governed more by your organization's policies regarding what you can and can't do than by the technical aspects.

New Contributor

I integrated the cluster with AD, but syncing AD groups to the cluster or to Ranger is not working. How do I set up the LDAP protocol?

Can I get some guidance on syncing users/groups from AD?

Contributor

@Geoffrey Shelton Okot

Can I use OpenLDAP instead of AD? I mean, create users and groups in OpenLDAP and use it as the backend for Kerberos?

Is that good practice?