Alfredo Sauce - Hadoop HTTP, Kerberos and SPNEGO

Kerberos SPNEGO authentication for HTTP has been part of Hadoop for some time now. On secure clusters, many services use it to authenticate HTTP APIs and Web UIs.

Setup and configuration can become a challenge because it involves many aspects: Kerberos principals, keytabs, network and load balancers, remote users accessing via different browsers and operating systems, and so on.

In this article I will explain how Kerberos SPNEGO authentication for HTTP works in Hadoop.

Introduction

Kerberos SPNEGO authentication for HTTP was introduced to Hadoop via HADOOP-7119. The implementation is based on a servlet filter that is configured to front all incoming HTTP requests to the application. If no valid hadoop.auth cookie is found, the servlet filter calls the KerberosAuthenticationHandler to perform Kerberos authentication for the UserAgent request. Upon successful Kerberos authentication, the servlet filter adds a signed cookie to the response so that subsequent requests, as long as the cookie is valid, are authenticated via the cookie alone rather than via the Kerberos API.
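To make that flow concrete, here is a minimal sketch of the decision logic written as a plain servlet filter. This is illustrative only, not Hadoop's actual filter code, and the class and helper names (SpnegoFilterSketch, hasValidHadoopAuthCookie) are hypothetical:

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;

// Simplified sketch of the hadoop-auth filter flow described above.
public class SpnegoFilterSketch implements Filter {
    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) resp;

        if (hasValidHadoopAuthCookie(request)) {
            // A valid signed hadoop.auth cookie short-circuits Kerberos entirely.
            chain.doFilter(req, resp);
        } else if (request.getHeader("Authorization") == null) {
            // No cookie and no Kerberos token: challenge the UserAgent.
            response.setHeader("WWW-Authenticate", "Negotiate");
            response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
        } else {
            // Authorization: Negotiate <token> is present. The real implementation
            // hands the token to the KerberosAuthenticationHandler and, on success,
            // sets a signed hadoop.auth cookie on the response before continuing.
            chain.doFilter(req, resp);
        }
    }

    // Hypothetical helper: the real filter also verifies the cookie's
    // signature and expiration, not just its presence.
    private boolean hasValidHadoopAuthCookie(HttpServletRequest request) {
        if (request.getCookies() == null) {
            return false;
        }
        for (Cookie c : request.getCookies()) {
            if ("hadoop.auth".equals(c.getName()) && !c.getValue().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}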

Configuration

As far as configuration goes, most Hadoop services support the following properties under similar names:

 

  • authentication.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab (points to the location of the SPNEGO keytab file)
  • authentication.kerberos.principal=HTTP/_HOST@REALM.COM (contains the principal name)
  • authentication.kerberos.name.rules (contains the auth_to_local rules)
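For example, in Oozie the same settings appear in oozie-site.xml under the oozie.authentication prefix. The snippet below is a sketch of what such a configuration typically looks like; verify the exact property names and prefix for your service and version:

oozie.authentication.type=kerberos
oozie.authentication.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab
oozie.authentication.kerberos.principal=HTTP/_HOST@REALM.COM
oozie.authentication.kerberos.name.rules=DEFAULT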


Implementation details

Kerberos SPNEGO authentication often requires more than one interaction until authentication is successful and a valid cookie is issued. Here is the sequence diagram for a successful authentication.

[Sequence diagram: successful Kerberos SPNEGO authentication]

Note: In the sequence diagram, HadoopAuthenticationFilter is actually an interface implemented by many Hadoop services. For simplicity, instead of using a class name specific to any Hadoop service, I have kept the interface name.
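For reference, the flow can also be sketched in text; the step numbers match the list below:

UserAgent                                HadoopAuthenticationFilter
    |                                               |
    | (1) GET /resource                             |
    |     (no hadoop.auth cookie, no Negotiate hdr) |
    |---------------------------------------------->|
    |     401 Unauthorized                          |
    |     WWW-Authenticate: Negotiate               |
    |<----------------------------------------------|
    | (2) GET /resource                             |
    |     Authorization: Negotiate <token>          |
    |---------------------------------------------->|
    |     (3) find service principal                |
    |     (4) authenticate via Kerberos API         |
    |     200 OK                                    |
    |     Set-Cookie: hadoop.auth=<signed value>    |
    |<----------------------------------------------|
    | (5) further requests with valid cookie        |
    |---------------------------------------------->|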

Diagram steps:

  • Step (1): The first interaction any UserAgent makes contains neither a valid hadoop.auth cookie nor the HTTP Authorization: Negotiate header, which is required to perform Kerberos authentication. Hence the KerberosAuthenticationHandler quickly responds with HTTP/1.1 401 Unauthorized.
  • Step (2): The second interaction the UserAgent makes contains the HTTP Authorization: Negotiate header. The header value is base64 encoded and contains the client's Kerberos token. At this point the KerberosAuthenticationHandler performs the following two key steps:
  • Step (3): Finding the right service principal to use is key to authenticate using kerberos. Here are the steps involved in finding the right service principal:
  1. At initialization time the KerberosAuthenticationHandler reads the principals from the SPNEGO keytab (authentication.kerberos.keytab). Those principals are usually in the form HTTP/_HOST@REALM.COM, where _HOST matches the FQDN of the server where the service is running and/or the load balancer FQDN.
  2. Based on the incoming request, the KerberosAuthenticationHandler computes the serverName. Here is the exact way in which serverName is computed:
final String serverName = InetAddress.getByName(request.getServerName()).getCanonicalHostName();

Let’s break this down:

  1. request.getServerName(): Returns the host name of the server to which the request was sent. It is the value of the part before ":" in the Host header value, if any, or the resolved server name, or the server IP address (ref: ServletRequest API)
  2. InetAddress.getByName(_): Determines the IP address of a host, given the host's name (ref: InetAddress API)
  3. getCanonicalHostName(): Gets the fully qualified domain name for this IP address. Best effort method, meaning we may not be able to return the FQDN depending on the underlying system configuration (ref: InetAddress API)

It’s important that reverse DNS resolution is configured appropriately so that steps 1 to 3 result in a valid FQDN.

3. The serverName is then used to search the hash map loaded at initialization time for the right service principal name. If a service principal is found, this step completes successfully and you can see the following TRACE message in the logs:

TRACE KerberosAuthenticationHandler:422 - SPNEGO with server principals:[HTTP/serverName@REALM.COM] for serverName

If no principal is found you will see the following (notice the empty brackets):

TRACE KerberosAuthenticationHandler:422 - SPNEGO with server principals:[] for serverName
  • Step (4): The KerberosAuthenticationHandler authenticates using the Kerberos API. Upon successful authentication it creates a valid authentication token. The HadoopAuthenticationFilter receives the token, creates a valid hadoop.auth cookie, and allows the request to continue to the requested resource. If trace is enabled, the logs will show:
TRACE KerberosAuthenticationHandler:467 - SPNEGO initiated with server principal [HTTP/fqdn_of_server@REALM.COM]
TRACE KerberosAuthenticationHandler:494 - SPNEGO completed for client principal [user@REALM.COM]
  • Step (5): Subsequent requests made by the UserAgent contain a valid hadoop.auth cookie. While the cookie remains valid, no Kerberos authentication is performed.


Advanced Setup with Load Balancer

Here is the list of things to check when configuring an LB (Load Balancer) with Kerberos SPNEGO authentication for HTTP:

  1. Sticky/persistent sessions are required in the LB configuration.
  2. A new Kerberos service principal needs to be created for the LB, HTTP/<YOUR_LOAD_BALANCER_FQDN>@REALM.COM, and its keytab entries added to the authentication.kerberos.keytab file on all the application service nodes.
  3. The configuration property authentication.kerberos.principal must be set to a wildcard so that the LB service principal is also loaded when the KerberosAuthenticationHandler initializes.

authentication.kerberos.principal=*

4. The load balancer's FQDN will possibly resolve to multiple different IP addresses. From the application service hosts, reverse DNS lookups for these IP addresses must resolve back to the load balancer FQDN.

Here is an example:

Load balancer FQDN: elb.example.com

elb.example.com is mapped to 2 different internal IP addresses -> 192.168.1.10 and 192.168.1.15

Note: Issuing the ping command multiple times helps to find out which IP addresses the FQDN resolves to.

ping elb.example.com
PING elb.example.com (192.168.1.10) 56(84) bytes of data.
ping elb.example.com
PING elb.example.com (192.168.1.15) 56(84) bytes of data.

Reverse resolution of IP 192.168.1.10 must be elb.example.com

Reverse resolution of IP 192.168.1.15 must be elb.example.com
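To verify this from an application service host, a quick Java sketch like the following can help; the class name CheckReverseDns is mine (a hypothetical diagnostic utility, not part of Hadoop):

import java.net.InetAddress;

public class CheckReverseDns {
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("Usage: CheckReverseDns <LoadBalancerFQDN>");
            return;
        }
        final String fqdn = args[0];
        // getAllByName returns every IP address the FQDN resolves to.
        for (InetAddress addr : InetAddress.getAllByName(fqdn)) {
            // Reverse-resolve each IP back to its canonical host name.
            String reverse = addr.getCanonicalHostName();
            System.out.format("%s -> %s (reverse: %s)%n", fqdn, addr.getHostAddress(), reverse);
            if (!fqdn.equalsIgnoreCase(reverse)) {
                System.out.println("  WARNING: reverse lookup does not match the load balancer FQDN");
            }
        }
    }
}

Compile and run it the same way as GetServerName below: javac CheckReverseDns.java, then java CheckReverseDns <LoadBalancerFQDN>.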

You can use the following Java code to find out exactly how the serverName is computed, starting from the Host:

import java.net.InetAddress;

public class GetServerName {
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("ERROR: Missing argument <Host>");
            System.out.println("Usage: GetServerName <Host>");
        } else {
            // Same computation the KerberosAuthenticationHandler performs:
            // resolve the host, then reverse-resolve to the canonical FQDN.
            final String serverName = InetAddress.getByName(args[0]).getCanonicalHostName();
            System.out.format("Server name for %s is %s%n", args[0], serverName);
        }
    }
}
  1. Create a file named GetServerName.java with the above content
  2. Run javac GetServerName.java
  3. Run java GetServerName <Host>

Remote Users - Browser Configuration

You should try to answer the following questions when configuring remote UserAgents:

  1. Do you have a valid Kerberos ticket? Client remote users and services must acquire a valid Kerberos ticket. While this task can be automated, sometimes it has to be done manually. In either case you can check which ticket you have by running the klist command (see the example after this list).
  2. What is the REALM of the principal being used, and what is the REALM of the service principal you're trying to connect to? If the realms don't match, you should perform the necessary configuration to establish trust between the REALMs. Use the klist command to find out which REALM your principal belongs to. On the service side you can also use klist -kt to list the contents of the keytab and find the REALM the service is using.
  3. Is your browser configured to perform SPNEGO correctly? There are several articles on the web that cover this configuration for the most popular browsers. Make sure you follow the steps for your browser.
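For questions 1 and 2, acquiring and inspecting a ticket typically looks like this (the principal below is just an example):

kinit user@REALM.COM
klist

The klist output shows your default principal, including its REALM, and the tickets currently held in the credential cache.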

Troubleshooting and DEBUG

Server Side

Your service log files are the place to check. To debug, I recommend adding the following to your log4j configuration:

log4j.logger.org.apache.hadoop.security.authentication.server=TRACE

And for Kerberos DEBUG you can also add the Java argument -Dsun.security.krb5.debug=true.

Client Side

I find it very helpful to use the curl command like this:

curl -iv --negotiate -u : -X GET 'http://URL'

With this configuration curl will display each interaction and headers involved. Here is an example:

Note: The greater-than sign ( > ) indicates a request from the UserAgent to the application. The less-than sign ( < ) indicates a response from the application to the UserAgent.

curl -iv --negotiate -u : -X GET 'http://oozielb.example.com:11000/oozie/'

> GET /oozie/ HTTP/1.1
> Host: oozielb.example.com:11000
> User-Agent: curl/7.54.0
> Accept: */*

< HTTP/1.1 401 Unauthorized
< Date: Wed, 21 Feb 2018 17:29:15 GMT
< Content-Type: text/html;charset=utf-8
< Content-Length: 997
< Connection: keep-alive
< Server: Apache-Coyote/1.1
< WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=; Path=/; HttpOnly

> GET /oozie/ HTTP/1.1
> Host: oozielb.example.com:11000
> Authorization: Negotiate YII.................... (this is the client's Kerberos token)
> User-Agent: curl/7.54.0
> Accept: */*

< HTTP/1.1 200 OK
< Date: Wed, 21 Feb 2018 17:29:15 GMT
< Content-Type: text/html
< Content-Length: 3754
< Connection: keep-alive
< Server: Apache-Coyote/1.1
< Set-Cookie: hadoop.auth="u=falbani&p=falbani@EXAMPLE.COM&t=kerberos&e=1519270155204&s=6RmPzEYJR0nsF2i7TFk4S+lNydc="; Path=/; HttpOnly
< Set-Cookie: JSESSIONID=254F8AA4060810E7545DEE95F2E6AB83; Path=/oozie
After this, you will see the HTML web page content.
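As a side note, the hadoop.auth cookie value above is a signed token whose fields you can read: u= carries the short user name, p= the full Kerberos principal, t= the authentication type, e= the expiration time in epoch milliseconds, and s= the server-side signature that makes the cookie tamper-evident.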


Article Title

If you are wondering about the article title, you should review JIRA HADOOP-7119 😉

Thanks

Special thanks to @emattos and @Vipin Rathor, who helped review this article.
