Windows authentication (Active Directory) to Hive with ODBC

Expert Contributor

I want to automatically authenticate a Windows user who is in Active Directory through an app that uses an ODBC connection. For example, with MSSQL Server, when I log in to Windows with my AD account and open SQL Server Management Studio, I get an option to use "Windows Authentication" (e.g. http://i.stack.imgur.com/Zl876.png). I would like to do exactly the same thing through my application, except against Hive/HDFS/etc. I found this article, https://github.com/abajwa-hw/security-workshops/blob/master/Setup-knox-23.md, which shows how to use Knox/Ranger to authenticate against AD, but it still requires the user to enter their login information. Is what I am asking possible? Or is the only option to have the user enter their AD login information again?

1 ACCEPTED SOLUTION

Master Mentor
@Kevin Vasko

I believe you are asking for SSO, Single Sign On.

SSO and Knox integration works. http://hortonworks.com/blog/hadoop-security-today-and-tomorrow/

Perimeter level Security With Apache Knox

Apache Hadoop has Kerberos for authentication. However, some organizations require integration with their enterprise identity management and Single Sign-On (SSO) solutions. Hortonworks created Apache Knox Gateway (Apache Knox) to provide Hadoop cluster security at the perimeter for REST/HTTP requests and to enable the integration of enterprise identity-management solutions. Apache Knox provides integration with corporate identity systems such as LDAP, Active Directory (AD) and will also integrate with SAML based SSO and other SSO systems.

Apache Knox also protects a Hadoop cluster by hiding its network topology to eliminate the leak of network internals. A network firewall may be configured to deny all direct access to a Hadoop cluster and accept only the connections coming from the Apache Knox Gateway over HTTP. These measures dramatically reduce the attack vector.

Finally, Apache Knox promotes the use of REST/HTTP for Hadoop access. REST is proven, scalable, and provides client interoperability across languages, operating systems, and computing devices. By using Hadoop REST/HTTP APIs through Knox, clients do not need a local Hadoop installation.
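To make the REST/HTTP access pattern described above concrete, here is a minimal sketch of a WebHDFS directory listing request routed through a Knox gateway. The host name, the topology name `default`, port 8443, and the credentials are all assumptions for illustration; check your own gateway configuration. Note that this uses HTTP Basic credentials, so on its own it does not remove the need for the user to supply a login. The request is only built here, not sent.

```python
import base64
import urllib.request


def build_knox_webhdfs_request(gateway_host, topology, hdfs_path, user, password, port=8443):
    """Build (but do not send) a WebHDFS LISTSTATUS request that goes through Knox."""
    # Knox typically exposes services under /gateway/<topology>/<service>.
    # "gateway" as the context path and 8443 as the port are assumed defaults.
    url = f"https://{gateway_host}:{port}/gateway/{topology}/webhdfs/v1{hdfs_path}?op=LISTSTATUS"

    # HTTP Basic credentials, which Knox can check against LDAP/AD.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host and account, for illustration only.
req = build_knox_webhdfs_request("knox.example.com", "default", "/tmp", "alice", "secret")
print(req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, given a reachable gateway and a trusted TLS certificate.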

Expert Contributor

Single Sign On sounds like what I want. However, I am having trouble finding any documentation on setting it up. Could you point me towards some documentation to start reading?

Master Mentor

@Kevin Vasko There are several vendors that provide SSO (paid).

Open Source SSO

Expert Contributor

Thanks! I was actually looking for instructions on how to configure Knox to work with Active Directory Federation Services. This is the best information that I can find: https://knox.apache.org/books/knox-0-7-0/user-guide.html#Quick+Start. It still doesn't answer the question of what I need to change on the application side (ODBC) to get it to work without the user having to log in (other than with their Windows login). Or am I missing or misunderstanding something?

Expert Contributor

So I have been doing some more research, and I was clearly confused. I wasn't making the distinction between SSO and AD; I was just thinking "central login storage place". I know you linked to the wiki page with paid SSOs, but I saw "Active Directory Federation Services" and thought "ok, AD is on the list". It didn't click that AD != ADFS. Me being stupid...

Expert Contributor

...Now that I understand that, I see what you mean by using HTTP headers. Assuming the SSO we use supports SAML 2.0 (I am guessing ADFS does), I would need to configure my ODBC DSN as "no authentication" and then use the Thrift transport HTTP section to specify the HTTP path and the "header: SM_USER: XYZ" information? Does that sound about right? I found these instructions on ODBC configuration, but the ODBC DSN interface looks old in those screenshots.

Rising Star

I would generally say that if you're talking SSO for ODBC on Windows, the easiest option client side is Kerberos. As long as the user is logged into their workstation as an AD user that has rights on the Hive tables, the Kerberos ticket the OS gets at login will be sufficient for authentication to Hive over the SASL transport. You just need to make sure you've kerberized your cluster using one of your Active Directory Domain Controllers as the KDC (http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/ch_configur...). At that point, just configure the ODBC driver for Kerberos.


Again, the only client side requirement is that the machine is joined to the domain and the user logs in with an authorized account.
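For an application that connects without a preconfigured DSN, the Kerberos setup described above can be sketched as a connection string. This is only an illustration: the driver name and the key names (`AuthMech=1` for Kerberos, `KrbHostFQDN`, `KrbServiceName`, `KrbRealm`, `HiveServerType`) are my understanding of the Simba-based Hive ODBC driver and should be verified against the documentation for the driver version you install. The host names and realm are hypothetical.

```python
def build_hive_kerberos_conn_str(host, krb_host_fqdn, realm=None, port=10000,
                                 driver="Hortonworks Hive ODBC Driver"):
    """Assemble a DSN-less ODBC connection string for HiveServer2 with Kerberos."""
    parts = {
        "Driver": "{" + driver + "}",   # assumed driver name as registered in ODBC
        "Host": host,
        "Port": str(port),              # 10000 is the usual HiveServer2 binary port
        "HiveServerType": "2",          # HiveServer2
        "AuthMech": "1",                # assumed: 1 selects Kerberos
        "KrbHostFQDN": krb_host_fqdn,   # FQDN used in the Hive service principal
        "KrbServiceName": "hive",       # service part of the principal, usually "hive"
    }
    if realm:
        parts["KrbRealm"] = realm       # optional; otherwise the default realm is used
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical host, for illustration only. No user name or password appears:
# the driver is expected to use the ticket Windows obtained at login.
conn_str = build_hive_kerberos_conn_str("hive.example.com", "hive.example.com")
print(conn_str)
```

With a library such as pyodbc, the resulting string would be passed to `pyodbc.connect(conn_str, autocommit=True)` on a domain-joined machine.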

Expert Contributor

Thanks! That looks a lot easier than dealing with Knox and Ranger initially. I still need to do a lot of reading on Ranger/Knox/Kerberos and how they all interact. I haven't worked with any of them to any great extent on the sandbox.