Created 03-22-2017 10:30 PM
Has anyone made HAProxy work with Kerberos-ed WebHDFS for HA?
I've been trying to, but couldn't make it work. Right now I'm testing with the simplest haproxy.cfg, like the one below:
...
frontend main *:50070
    default_backend app

backend app
    server node2 node2.localdomain:50070 check
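For reference, a minimal but complete config in that style would look roughly like the sketch below (the global/defaults values are assumptions for illustration, not from my actual file):

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5s
    timeout client  50s
    timeout server  50s

frontend main *:50070
    default_backend app

backend app
    server node2 node2.localdomain:50070 check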
Also, the spnego.service.keytab on the NameNode is:
[root@node2 keytabs]# klist -k spnego.service.keytab
Keytab name: FILE:spnego.service.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node1.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
   1 HTTP/node2.localdomain@HO-UBU02
And I'm getting "HTTP/1.1 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)".
Or, which tool/software would you use to provide HA for Kerberos-ed WebHDFS if Knox and hadoop-httpfs are not options?
Created 03-22-2017 11:36 PM
@Hajime - Please see the Apache documentation below for WebHDFS authentication:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Authentication
As stated in point 2, for a Kerberos-enabled cluster you can kinit with the SPNEGO principal and then perform the WebHDFS operation.
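For example, something along these lines (the keytab path and principal here are assumptions based on your cluster, not prescriptive values):

kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/node2.localdomain@HO-UBU02
curl -i --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'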
Can you please paste the command you are trying and the stack trace to debug further?
Created 03-23-2017 12:08 AM
[root@node1 ~]# curl -i --negotiate -u : 'http://node1.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1408
Server: Jetty(6.1.26.hwx)

HTTP/1.1 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:05:33 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1532
Server: Jetty(6.1.26.hwx)

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)</title>
</head>
<body><h2>HTTP ERROR 403</h2>
<p>Problem accessing /webhdfs/v1/tmp/. Reason:
<pre> GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)</pre></p>
<hr /><i><small>Powered by Jetty://</small></i><br/>
<br/>
<br/>
...
I'm getting this.
Also, if I use a delegation token, it works, but a normal user wouldn't know how to get their own delegation token... 😞
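For reference, this is roughly the flow I mean (a sketch; the token placeholder is illustrative):

# Obtain a delegation token from the active NameNode (this step still needs a valid Kerberos ticket)
curl -s --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
# -> {"Token":{"urlString":"<token-string>"}}

# The token can then be used without SPNEGO, e.g. through the proxy
curl -i 'http://node1.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS&delegation=<token-string>'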
Created 03-23-2017 12:25 AM
Can you please share your hdfs-site.xml file?
Created 03-23-2017 12:29 AM
Sure! Thank you for taking a look at this issue.
Created 03-23-2017 12:45 AM
Can you please paste the output of running the commands below:
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/node2.localdomain@HO-UBU02
klist
curl -i --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
Created 03-23-2017 01:00 AM
@Namit Maheshwari node2 is the Active NameNode right now (node1 is the HAProxy server).
I changed it to curl -I; if you prefer curl -i, let me know. Thank you!
[root@node2 ~]# kdestroy
kdestroy: No credentials cache found while destroying cache
[root@node2 ~]# kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/node2.localdomain@HO-UBU02
[root@node2 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/node2.localdomain@HO-UBU02

Valid starting     Expires            Service principal
03/23/17 00:54:33  03/23/17 10:54:33  krbtgt/HO-UBU02@HO-UBU02
        renew until 03/30/17 00:54:33
[root@node2 ~]# curl -I --negotiate -u : 'http://node2.localdomain:50070/webhdfs/v1/tmp/?op=LISTSTATUS'
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Length: 1408
Server: Jetty(6.1.26.hwx)

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Thu, 23 Mar 2017 00:55:47 GMT
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Expires: Thu, 23 Mar 2017 00:55:47 GMT
Date: Thu, 23 Mar 2017 00:55:47 GMT
Pragma: no-cache
Content-Type: application/json
Set-Cookie: hadoop.auth="u=HTTP&p=HTTP/node2.localdomain@HO-UBU02&t=kerberos&e=1490266548000&s=HN3jepaKuYI5iKYfJ5IW1wHxJ3M="; Path=/; HttpOnly
Content-Length: 0
Server: Jetty(6.1.26.hwx)

[root@node2 ~]#
Created 03-23-2017 01:13 AM
So, this works fine as expected when we point the curl call at the Active NameNode instead of the proxy server.
Yeah, I don't think we can work around this by using a proxy. There is an Apache Jira already open for the issue:
https://issues.apache.org/jira/browse/HDFS-6371
For now, I think you can either use the approach we used above, or use Knox / HttpFS.
Created 03-23-2017 07:25 AM
Hello @Hajime San,
Since you have Kerberos & SPNEGO enabled for both NameNodes, when you make a request to the HAProxy URL, curl generates a Kerberos authenticator. Among other session details, this contains your principal name (your user) and the Kerberos service principal name of the host you contacted (HTTP/<ha-proxy-node>, i.e. HTTP/node1.localdomain). When this authenticator reaches the NameNode (node2.localdomain), it finds that the received authenticator is intended for a service running with the 'HTTP/node1.localdomain' service principal. Since that doesn't match the NameNode's own service principal, HTTP/node2.localdomain, the NameNode can't validate the ticket, and an error like 'Checksum failed' is returned.
To fix this name mismatch, you need to set "dfs.web.authentication.kerberos.principal=*" in the HDFS configuration in Ambari, so that the NameNode accepts other principal names as well.
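In Ambari this lands in hdfs-site.xml; the raw equivalent would look roughly like the snippet below (a sketch; the '*' wildcard makes the SPNEGO handler log in with every HTTP/* principal found in its keytab):

<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>*</value>
</property>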
Hope this helps!
Created 04-01-2017 01:15 AM
Thanks to @Vipin Rathor, I was able to set up HAProxy for Kerberos-ed WebHDFS.
After that, some changes were needed for Ambari, so I wrote this up: https://community.hortonworks.com/articles/91685/how-to-setup-haproxy-for-webhdfs-ha.html (in Japanese)