I'm getting really lost trying to set up a secured cluster; I find it very complicated to properly configure Ranger and Knox on a kerberized cluster.
Is there someone who would be willing to help me? I would be very grateful.
Thanks in advance.
Hello @Laurent lau,
Could you tell us what problem you are facing? Without a complete problem statement, it will be tough for any of us to help you out. So please edit your question and describe your problem more clearly. Thanks.
Hello @Vipin Rathor,
I can't find any clear documentation covering the entire setup of a kerberized cluster synchronized with LDAP (not AD), in which users retrieve a Kerberos ticket and access is authorized via Ranger and Knox.
The first step for now is to secure WebHDFS.
I don't want to generate keytabs for our hundreds of users; I would like them to be authenticated by LDAP and then retrieve a Kerberos ticket automatically.
Any clue how to do this?
There are a couple of ways to integrate LDAP with Hadoop services (like Ambari, Ranger and Knox). But first of all, let me separate LDAP from authentication. Hadoop services usually use Kerberos for authentication, where you can use MIT KDC, Microsoft Active Directory (+ Centrify) or FreeIPA to store your authentication secrets and let Hadoop services use them to tell the good users from the bad ones.
LDAP, on the other hand, is best thought of as a repository service. An LDAP store mostly holds user/group information (just like a telephone directory), and Hadoop services mostly use it as the source from which to sync user/group information. LDAP is not considered particularly safe as a source of authentication (securing the communication channel, read: SSL, is a must here).
That said, nothing stops you from using LDAP for authentication. For Ambari, you can find steps here to configure it for LDAP authentication. Similarly, for Ranger, the steps to configure Ranger user authentication against LDAP can be found here.
Knox (and Zeppelin as well) supports LDAP authentication through its Shiro configuration. The Knox documentation can be found here, the Zeppelin steps are here, and my article can be found here (it is written for AD, but the same concept applies to LDAP authentication).
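To illustrate the Knox route for WebHDFS: once LDAP authentication is configured in the Knox topology, a client sends its LDAP username and password over HTTPS using HTTP Basic authentication, and Knox talks to the kerberized cluster on the user's behalf, so the user does not need a keytab. Below is a minimal Python sketch of such a request. It assumes the default gateway port 8443, the gateway path "gateway" and a topology named "default"; the host name and credentials are placeholders, and the helper function names are made up for this illustration.

```python
import base64
import urllib.parse
import urllib.request


def knox_webhdfs_url(host, path, op="LISTSTATUS", port=8443,
                     gateway_path="gateway", topology="default"):
    """Build a WebHDFS REST URL that goes through the Knox gateway.

    port, gateway_path and topology are assumed defaults; adjust them
    to match your own gateway configuration.
    """
    quoted = urllib.parse.quote(path.lstrip("/"))
    query = urllib.parse.urlencode({"op": op})
    return (f"https://{host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1/{quoted}?{query}")


def basic_auth_header(username, password):
    """Encode LDAP credentials as an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def build_request(host, path, username, password, **url_options):
    """Prepare (but do not send) a WebHDFS request routed through Knox."""
    return urllib.request.Request(
        knox_webhdfs_url(host, path, **url_options),
        headers=basic_auth_header(username, password),
    )


# Placeholder host and credentials, for illustration only.
request = build_request("knox.example.com", "/tmp", "some_ldap_user", "secret")
print(request.full_url)
```

Actually sending the request (for example with urllib.request.urlopen(request)) requires a reachable gateway whose TLS certificate the client trusts. The response would be the usual WebHDFS JSON.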
In summary, Kerberos is the ubiquitous authentication protocol accepted by the Hadoop community and should be used wherever possible. LDAP use should be limited to directory services.
Hope this helps!
Thanks @Vipin Rathor for your detailed answer.
I understand the advantages of using Kerberos; however, what I find tedious is that I need to recreate the user accounts (principals) in the Kerberos database and manage new password policies.
As I said in my previous message, ideally I would like to configure the cluster to authenticate users against LDAP and automatically retrieve a Kerberos ticket, but I don't know if that's feasible.