Assume you have a kerberized cluster that uses both LDAP and a standalone KDC. In such a cluster, what steps does Knox go through to authenticate via LDAP and Kerberos? My understanding is that when you log in to Knox, Knox does an LDAP bind against the LDAP server; once the user is authenticated, Knox (now behind the proxy) contacts the Kerberos service and gets a ticket for that user. My questions are:
1. Are these assumptions correct?
2. Which user does Knox get a ticket for, the knox user, or the user that is used when binding to LDAP?
You would create a knox user that authenticates against Kerberos. This user can impersonate the users connecting through it (i.e. it acts as a proxy user). The following settings are required in core-site.xml. Also check the following link.
<property>
  <name>hadoop.proxyuser.knox.groups</name>
  <value>users</value>
</property>
<property>
  <name>hadoop.proxyuser.knox.hosts</name>
  <value>$knox-host</value>
</property>
@Ed Yes, your first assumption is correct. LDAP/AD/SSO mechanisms are used to authenticate the user to Knox.
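For illustration, the LDAP step is normally configured as the authentication provider in the Knox topology file. The fragment below is a minimal sketch modelled on the sample topology that ships with Knox 0.x; the LDAP URL and the user DN template are placeholder values, and the realm class name may differ in other Knox versions, so treat all three as assumptions to adapt.

```xml
<!-- Sketch of the authentication provider in a Knox topology file. -->
<!-- The LDAP URL and DN template are placeholders, not real endpoints. -->
<provider>
  <role>authentication</role>
  <name>ShiroProvider</name>
  <enabled>true</enabled>
  <param>
    <!-- Realm class as packaged in Knox 0.x (assumed for this cluster) -->
    <name>main.ldapRealm</name>
    <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
  </param>
  <param>
    <!-- {0} is replaced with the username supplied by the client -->
    <name>main.ldapRealm.userDnTemplate</name>
    <value>uid={0},ou=people,dc=example,dc=com</value>
  </param>
  <param>
    <name>main.ldapRealm.contextFactory.url</name>
    <value>ldap://ldap.example.com:389</value>
  </param>
  <param>
    <!-- "simple" means Knox performs a plain LDAP bind with the user's credentials -->
    <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
    <value>simple</value>
  </param>
  <param>
    <!-- Require HTTP Basic authentication on every URL of the topology -->
    <name>urls./**</name>
    <value>authcBasic</value>
  </param>
</provider>
```

With a provider like this, the client sends HTTP Basic credentials to Knox, and Knox binds to LDAP as that user to verify them.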
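On your second point: Kerberos/SPNEGO is used for Knox to authenticate to the requested services, e.g. WebHDFS, Hive, etc. Knox has a SPNEGO keytab (for the knox user) for this purpose, so the Kerberos ticket is obtained for the knox user, not for the user who was bound against LDAP. Knox then performs identity assertion to pass the LDAP-authenticated user's identity on to the service. Refer: http://knox.apache.org/books/knox-0-8-0/user-guide.html#Identity+Assertion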
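As a rough sketch of the Knox side of the Kerberos setup, the gateway-site.xml properties below tell Knox that the cluster is secured and where to find its Kerberos and JAAS configuration. The file paths are assumed example locations; the keytab and principal for the knox user are set in the JAAS login file they point to, not in this fragment.

```xml
<!-- Sketch of Kerberos-related settings in Knox's gateway-site.xml. -->
<!-- File paths are assumed example locations; adjust to your install. -->
<property>
  <!-- Tells Knox that the Hadoop cluster behind it is kerberized -->
  <name>gateway.hadoop.kerberos.secured</name>
  <value>true</value>
</property>
<property>
  <!-- Kerberos client configuration (realm and KDC details) -->
  <name>java.security.krb5.conf</name>
  <value>/etc/knox/conf/krb5.conf</value>
</property>
<property>
  <!-- JAAS login file that names the knox keytab and principal -->
  <name>java.security.auth.login.config</name>
  <value>/etc/knox/conf/krb5JAASLogin.conf</value>
</property>
```

Combined with the hadoop.proxyuser.knox.* settings on the cluster side, this lets Knox log in as the knox principal and make requests on behalf of the end user it authenticated.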