Secured WebHDFS HA with HAProxy
Labels: Apache Hadoop
Created ‎03-22-2017 10:30 PM
Has anyone made HAProxy work with Kerberos-ed WebHDFS for HA?
I've been trying to, but couldn't make it work. Right now I'm testing with the simplest possible haproxy.cfg, shown below:

...
frontend main *:50070
    default_backend app

backend app
    server node2 node2.localdomain:50070 check
Also, spnego.service.keytab on the NameNode is:

[root@node2 keytabs]# klist -k spnego.service.keytab
Keytab name: FILE:spnego.service.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
And I'm getting "HTTP/1.1 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)".
Alternatively, which tool/software would you use to provide HA for Kerberos-enabled WebHDFS, if Knox and hadoop-httpfs are not options?
Created ‎03-22-2017 11:36 PM
@Hajime - Please see the Apache documentation below for WebHDFS authentication:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Authentication
As stated in point 2, on a Kerberos-enabled cluster you can kinit with the SPNEGO principal and then perform the WebHDFS operation.
Can you please paste the command you are trying and the stack trace to debug further?
Created ‎03-23-2017 12:08 AM
[root@node1 ~]# curl -i --negotiate -u : 'http://node1.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1408
Server: Jetty(6.1.26.hwx)

HTTP/1.1 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1532
Server: Jetty(6.1.26.hwx)

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)</title>
</head>
<body><h2>HTTP ERROR 403</h2>
<p>Problem accessing /webhdfs/v1/tmp/. Reason:
<pre> GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)</pre></p><hr /><i><small>Powered by Jetty://</small></i><br/>
...
I'm getting this.
Also, if I use a delegation token it works, but a normal user wouldn't know how to get their own delegation token... 😞
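
For reference, the delegation token flow itself looks roughly like this (a sketch; <token-urlString> is a placeholder for the urlString field returned in the JSON response):

# Fetch a delegation token from the active NameNode (needs a valid Kerberos ticket)
curl -s --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=hdfs'

# Later requests can pass the token and skip Kerberos negotiation entirely
curl -i 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS&delegation=<token-urlString>'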
Created ‎03-23-2017 12:25 AM
Can you please share your hdfs-site.xml file?
Created ‎03-23-2017 12:29 AM
Sure! Thank you for taking a look at this issue.
Created ‎03-23-2017 12:45 AM
Can you please paste the output of running the below commands:

kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/node2.localdomain@HO-UBU02
klist
curl -i --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
Created ‎03-23-2017 01:00 AM
@Namit Maheshwari node2 is the active NameNode right now (node1 is the HAProxy server).
I changed to curl -I; if you prefer curl -i, let me know. Thank you!
[root@node2 ~]# kdestroy
kdestroy: No credentials cache found while destroying cache
[root@node2 ~]# kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/node2.localdomain@HO-UBU02
[root@node2 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/node2.localdomain@HO-UBU02

Valid starting     Expires            Service principal
03/23/17 00:54:33  03/23/17 10:54:33  krbtgt/HO-UBU02@HO-UBU02
        renew until 03/30/17 00:54:33
[root@node2 ~]# curl -I --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1408
Server: Jetty(6.1.26.hwx)

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Thu, 23 Mar 2017 00:55:47 GMT
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Expires: Thu, 23 Mar 2017 00:55:47 GMT
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Content-Type: application/json
Set-Cookie: hadoop.auth="u=HTTP&p=HTTP/node2.localdomain@HO-UBU02&t=kerberos&e=1490266548000&s=HN3jepaKuYI5iKYfJ5IW1wHxJ3M="; Path=/; HttpOnly
Content-Length: 0
Server: Jetty(6.1.26.hwx)
[root@node2 ~]#
Created ‎03-23-2017 01:13 AM
So, this works fine as expected when we provide the active NameNode in the curl call instead of the proxy server.
Yeah, I don't think we can work around this by using a proxy. There is an Apache Jira already open for the issue:
https://issues.apache.org/jira/browse/HDFS-6371
For now, I think you can either use the approach we used above, or use Knox / HttpFS.
Created ‎03-23-2017 07:25 AM
Hello @Hajime San,
Since you have Kerberos & SPNEGO enabled on both NameNodes, here is what happens: when you make a request to the HAProxy URL, curl generates a Kerberos authenticator. Among other session details, this contains your user principal name and the Kerberos service principal name derived from the hostname in the URL (HTTP/<ha-proxy-node>, i.e. HTTP/node1.localdomain). When this authenticator reaches the NameNode (node2.localdomain), the NameNode sees that it was intended for a service running with the 'HTTP/node1.localdomain' service principal. Since that doesn't match the NameNode's own service principal, HTTP/node2.localdomain, an error like 'Checksum failed' is returned.
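
You can see this from the client side, since the SPN is derived from the hostname in the URL, not from whichever NameNode actually serves the request. A quick check with the standard MIT Kerberos tools (a sketch, assuming a client with a valid ticket in the HO-UBU02 realm):

# Request a service ticket for the SPN curl would use for the proxy URL
kvno HTTP/node1.localdomain@HO-UBU02

# klist now shows a ticket for HTTP/node1.localdomain; node2's NameNode can
# only validate it if that key is also present in its own keytab
klist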
To fix this name mismatch, you need to set "dfs.web.authentication.kerberos.principal=*" in the HDFS configuration in Ambari, so that the NameNode accepts requests addressed to any HTTP principal present in its keytab. (Your spnego.service.keytab above already contains entries for both HTTP/node1.localdomain and HTTP/node2.localdomain, which is what makes this work.)
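
For reference, the equivalent hdfs-site.xml entry would look like this (a sketch; with Ambari you would change the property under the HDFS configs rather than edit the file directly):

<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>*</value>
</property>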
Hope this helps!
Created ‎04-01-2017 01:15 AM
Thanks to @Vipin Rathor, I was able to set up HAProxy for Kerberos-ed WebHDFS.
After that, some changes were needed for Ambari, so I wrote it up here: https://community.hortonworks.com/articles/91685/how-to-setup-haproxy-for-webhdfs-ha.html (in Japanese)
