Contributor
Posts: 29
Registered: ‎10-15-2015

SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

SPNEGO authentication (over HTTPS) is failing with OpenJDK >= 7u80 and with OpenJDK >= 8; OpenJDK 7u79 works fine (tested on Debian 7/wheezy with CDH 5.4.7 and CDH 5.5.0).

 

Note that HDFS High Availability depends on SPNEGO, so an entire cluster with security and HA enabled may be brought down in that case.

 

It is not so easy to catch, though:

  • we have some realms where SPNEGO works even with Java 8 (different ciphers, different speeds/timing?)
  • sometimes it is enough to already have a cookie from another HDFS service (one running an older Java)

I guess it is only a matter of merging this fix: HADOOP-10786

 

This fix looks promising, because it mentions exactly the same Java versions causing trouble.

 

Any chance of merging this fix into Cloudera soon?

Posts: 1,640
Kudos: 314
Solutions: 254
Registered: ‎07-31-2013

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)


The mentioned JIRA fix has been included in CDH since CDH 5.3.0, so your issue would appear to be something else.

 

Could you post your full error stack trace? Also, please note that CDH is only tested with the Oracle JDK, not OpenJDK (although they may be similar, we do not test it directly, so subtle bugs would not be documented).

 

Per Oracle's notes, though, you aren't supposed to use 7u80 unless you are hitting a very specific bug:

 

"""

What is the difference between a Java CPU (7u79) and PSU (7u80) release?
Java SE Critical Patch Updates (CPU) contain fixes to security vulnerabilities and critical bug fixes. Oracle strongly recommends that all Java SE users upgrade to the latest CPU releases as they are made available. Most users should choose this release.
Java SE Patch Set Updates (PSU) contain all of the security fixes in the CPUs released up to that version, as well as additional non-critical fixes. Java PSU releases should only be used if you are being impacted by one of the additional bugs fixed in that version. 

""" - http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html (and more at http://www.oracle.com/technetwork/java/javase/downloads/cpu-psu-explained-2331472.html)

 

What exact version of OpenJDK 8 fails, though? We recommend using (Oracle JDK) 8u60, or anything higher than 8u31 (but not 8u40 specifically), see http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_ig_req_supporte...

Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

The full stacktrace (using Oracle Java 8u66, from http://ppa.launchpad.net/webupd8team/java/ubuntu):

 

2015-12-07 08:51:37,934 WARN org.apache.hadoop.security.authentication.server.AuthenticationFilter: AuthenticationToken ignore
d: org.apache.hadoop.security.authentication.util.SignerException: Invalid signature
2015-12-07 08:51:38,068 WARN org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: 
GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
        at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
        at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:517)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1279)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
        at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
        at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1279)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
        at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:87)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
        at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
        at sun.security.jgss.spnego.SpNegoMechFactory.getCredentialElement(SpNegoMechFactory.java:142)
        at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
        at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
        at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:77)
        at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:160)
        at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:356)
        at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:348)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:348)
        ... 41 more

Good to know about the Java versions. We've just used the default Java in the Debian OS, and probably got lucky during the initial installation. :-)

 


Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

I also tried Oracle Java 8u60. The stacktrace is the same, only without the warning about the signature:

 

2015-12-07 14:18:19,577 WARN org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
	at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:517)
...
Posts: 1,640
Kudos: 314
Solutions: 254
Registered: ‎07-31-2013

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

I am unable to reproduce this yet. Are you seeing the failure in the context of a client program such as the WebHDFS REST APIs, or is checkpointing between your NameNodes also failing?

If the former, are you using a Windows platform, and/or AD for your KDC?
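One thing that may help narrow it down is enabling the JDK's Kerberos/SPNEGO debug tracing on the affected daemon (a sketch - where exactly to set HADOOP_OPTS depends on your distribution and init scripts):

```shell
# Append the JDK Kerberos/SPNEGO debug switches to the daemon's JVM options.
# The exact place to export this depends on your environment scripts.
export HADOOP_OPTS="$HADOOP_OPTS -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
echo "$HADOOP_OPTS"
```

The resulting daemon log should then show which principal, realm and encryption types the JVM tries while accepting the SPNEGO token.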
Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

The problem already occurs during the initial communication of the NameNodes with the JournalNodes (which use HTTPS), and the NameNodes fail to start:

 

2015-12-12 19:47:17,603 ERROR org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: caught exception initializing https://took10.ics.muni.cz:8481/getJournal?jid=took&segmentTxId=582975&storageInfo=-60%3A524437872%3A0%3Atook
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, status: 403, message: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:473)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:465)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:445)
        at org.apache.hadoop.security.SecurityUtil.doAsCurrentUser(SecurityUtil.java:439)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.getInputStream(EditLogFileInputStream.java:464)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.init(EditLogFileInputStream.java:141)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:192)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:250)
        at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
        at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
        at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:178)
        at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
        at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
        at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:178)
        at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:186)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:139)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:829)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:684)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:589)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:646)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:818)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:797)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1561)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, status: 403, message: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
        at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:275)
        at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77)
        at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
        at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
        at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:164)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:470)
        ... 30 more

So the checkpointing fails (if I understand the term correctly).

 

Another way to test it is to use JDK 7u79 on the NameNodes (keeping a fully working quorum and HDFS) and experiment with Java versions only on some DataNodes - test HTTPS on port 50475 there.
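For reference, the DataNode HTTPS endpoint can be poked directly with curl's SPNEGO support after a kinit (a sketch - dn1.example.org is a placeholder for one of your DataNodes, and /jmx is just a convenient authenticated endpoint):

```shell
# Placeholder hostname - substitute one of your own DataNodes.
DN_HOST="dn1.example.org"
URL="https://${DN_HOST}:50475/jmx"
echo "$URL"
# With a valid ticket (run kinit first), this exercises SPNEGO end to end:
#   curl --negotiate -u : -k "$URL"
```

A 403 with the same GSSException here, on an affected Java version only, would confirm the server-side nature of the failure.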

 

The platform is Linux (Debian 7/wheezy) and the KDC is a plain Kerberos KDC (not AD).

 

The interesting thing is that I have two test Hadoop clusters with different Kerberos KDCs; the first one works fine and the second one is problematic.

 

The HTTP keytab for the working one:

/var/lib/hadoop-hdfs/hadoop.keytab:

Vno  Type                     Principal                    Aliases
  2  des3-cbc-sha1            host/myriad15.zcu.cz@ZCU.CZ  
  2  aes256-cts-hmac-sha1-96  host/myriad15.zcu.cz@ZCU.CZ  
  2  des-cbc-md5              host/myriad15.zcu.cz@ZCU.CZ  
  2  des3-cbc-sha1            HTTP/myriad15.zcu.cz@ZCU.CZ  
  2  aes256-cts-hmac-sha1-96  HTTP/myriad15.zcu.cz@ZCU.CZ  
  2  des-cbc-md5              HTTP/myriad15.zcu.cz@ZCU.CZ  

The HTTP keytab for the problematic one:

Vno  Type                     Principal                            Aliases
  1  aes256-cts-hmac-sha1-96  HTTP/took44.ics.muni.cz@ICS.MUNI.CZ  
  1  des3-cbc-sha1            HTTP/took44.ics.muni.cz@ICS.MUNI.CZ  
  1  arcfour-hmac-md5         HTTP/took44.ics.muni.cz@ICS.MUNI.CZ  
  1  aes256-cts-hmac-sha1-96  host/took44.ics.muni.cz@ICS.MUNI.CZ  
  1  des3-cbc-sha1            host/took44.ics.muni.cz@ICS.MUNI.CZ  
  1  arcfour-hmac-md5         host/took44.ics.muni.cz@ICS.MUNI.CZ  

 

Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

OK, I think I know how to reproduce it:

  • the default realm in /etc/krb5.conf differs from the realm of the principals (in the problematic case: ICS.MUNI.CZ is the realm of the machine and service principals, META is the realm for the users in /etc/krb5.conf)
  • JDK > 7u79 or JDK >= 8
  • once you have the authentication cookie (obtained from a node with a different Java version or krb5.conf), SPNEGO then works on the other nodes too
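In other words, a krb5.conf shaped roughly like this should trigger it (hypothetical minimal fragment - META is the users' realm, ICS.MUNI.CZ the hosts' realm, as in the problematic cluster):

```
[libdefaults]
        default_realm = META

[domain_realm]
        .ics.muni.cz = ICS.MUNI.CZ
```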
Posts: 1,640
Kudos: 314
Solutions: 254
Registered: ‎07-31-2013

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

Thanks for following up! Do you have a valid [domain_realm] section also defined that maps your cluster hostnames / parent domain to their realm? I recall SPNEGO did depend on it when I tested some cross-realm activity in the past.
Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

Yes, it's there:

 

[domain_realm]
        ...
        .ics.muni.cz = ICS.MUNI.CZ
        .zcu.cz = ZCU.CZ
        ...

 

I'm also using the hadoop.security.auth_to_local property because:

  1. the DEFAULT rule would only map the default realm ("META"), and "ICS.MUNI.CZ" is needed too
  2. I have slightly different service principal names anyway

Non-HTTPS RPC calls work fine.

 

For typical service principal names it would be slightly different, of course (mapping the 'hdfs', 'yarn' users too, ...):

 <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
RULE:[2:$1;$2@$0](^jhs;.*@ICS.MUNI.CZ$)s/^.*$/mapred/
RULE:[2:$1;$2@$0](^[ndjs]n;.*@ICS.MUNI.CZ$)s/^.*$/hdfs/
RULE:[2:$1;$2@$0](^nfs;.*@ICS.MUNI.CZ$)s/^.*$/nfs/
RULE:[2:$1;$2@$0](^[rn]m;.*@ICS.MUNI.CZ$)s/^.*$/yarn/
RULE:[2:$1;$2@$0](^hbase;.*@ICS.MUNI.CZ$)s/^.*$/hbase/
RULE:[2:$1;$2@$0](^hive;.*@ICS.MUNI.CZ$)s/^.*$/hive/
RULE:[2:$1;$2@$0](^hue;.*@ICS.MUNI.CZ$)s/^.*$/hue/
RULE:[2:$1;$2@$0](^spark;.*@ICS.MUNI.CZ$)s/^.*$/spark/
RULE:[2:$1;$2@$0](^tomcat;.*@ICS.MUNI.CZ$)s/^.*$/tomcat/
RULE:[2:$1;$2@$0](^zookeeper;.*@ICS.MUNI.CZ$)s/^.*$/zookeeper/
RULE:[2:$1;$2@$0](^HTTP;.*@ICS.MUNI.CZ$)s/^.*$/HTTP/
DEFAULT
</value>
 </property>
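As a sanity check of what these rules do, the [2:$1;$2@$0] translation can be simulated outside Hadoop (a rough sketch in Python, not Hadoop's actual implementation):

```python
import re

def apply_rule(principal, match_re, sed_from, sed_to):
    """Roughly simulate one Hadoop RULE:[2:$1;$2@$0](match)s/from/to/ rule."""
    name, realm = principal.rsplit("@", 1)
    parts = name.split("/")
    if len(parts) != 2:
        return None          # [2:...] rules only apply to 2-component principals
    short = "%s;%s@%s" % (parts[0], parts[1], realm)   # the $1;$2@$0 format
    if re.match(match_re, short):
        return re.sub(sed_from, sed_to, short)
    return None              # rule does not apply

# The NameNode principal maps to the hdfs user:
print(apply_rule("nn/took44.ics.muni.cz@ICS.MUNI.CZ",
                 r"^[ndjs]n;.*@ICS.MUNI.CZ$", r"^.*$", "hdfs"))   # hdfs
# A principal from the default realm META does not match this rule:
print(apply_rule("nn/took44.ics.muni.cz@META",
                 r"^[ndjs]n;.*@ICS.MUNI.CZ$", r"^.*$", "hdfs"))   # None
```

This only illustrates why DEFAULT alone is not enough here: the ICS.MUNI.CZ principals need their own explicit rules.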

 

It looks like the changes in Java or HADOOP-10786 uncovered some other bug related to the cross-realm setup or the principal mapping...?

Contributor
Posts: 29
Registered: ‎10-15-2015

Re: SPNEGO authentication failure with openjdk >= 7u80 (HADOOP-10786?)

Btw. this fix looks interesting (it is against Hadoop 2.7.1):

HADOOP-12617

 
