Created on 03-19-2015 03:01 PM - edited 09-16-2022 02:24 AM
I installed the latest Cloudera Manager Server and Agent packages (5.3.2), started everything up fine, created my cluster, and installed parcels for HDFS, HBase, Hive, Hue, Impala, Oozie, YARN, and ZooKeeper. I fixed all configuration and health issues, and everything was working exactly as expected. I then began following the instructions I found here: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_sg_intro_kerb.html to get my cluster secured with Kerberos. I have my KDC fully configured and running. The wizard was working great until just after it prompted me for the admin principal (cloudera-scm/admin) and its password. After I entered those, it indicated that it had successfully authenticated that principal and began to configure my services for use with Kerberos. At some point, though (I believe as it was trying to turn the Cloudera Management Service back on), it failed and emitted this:
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Login failure for hue/my.namenode.com@MY.REALM.COM from keytab hue.keytab
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.cloudera.cmf.cdhclient.CdhExecutorFactory.createExecutor(CdhExecutorFactory.java:274)
at com.cloudera.cmf.cdhclient.CdhExecutorFactory.createExecutor(CdhExecutorFactory.java:309)
at com.cloudera.enterprise.AbstractCDHVersionAwarePeriodicService.<init>(AbstractCDHVersionAwarePeriodicService.java:73)
at com.cloudera.cmon.firehose.AbstractHBasePoller.<init>(AbstractHBasePoller.java:95)
at com.cloudera.cmon.firehose.HBaseFsckPoller.<init>(HBaseFsckPoller.java:53)
at com.cloudera.cmon.firehose.Firehose.createSecurityAwarePollers(Firehose.java:446)
at com.cloudera.cmon.firehose.Firehose.setupServiceMonitoringPollers(Firehose.java:436)
at com.cloudera.cmon.firehose.Firehose.<init>(Firehose.java:311)
at com.cloudera.cmon.firehose.Main.main(Main.java:527)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Login failure for hue/my.namenode.com@MY.REALM.COM from keytab hue.keytab
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at com.cloudera.cmf.cdhclient.CdhExecutorFactory.createExecutor(CdhExecutorFactory.java:268)
... 8 more
Caused by: java.lang.RuntimeException: java.io.IOException: Login failure for hue/my.namenode.com@MY.REALM.COM from keytab hue.keytab
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.cloudera.cmf.cdhclient.CdhExecutorFactory$SecureClassLoaderSetupTask.run(CdhExecutorFactory.java:491)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Login failure for hue/my.namenode.com@MY.REALM.COM from keytab hue.keytab
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:855)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:279)
at com.cloudera.cmf.cdh4client.CDH4ObjectFactoryImpl.login(CDH4ObjectFactoryImpl.java:194)
at com.cloudera.cmf.cdhclient.CdhExecutorFactory$SecureClassLoaderSetupTask.run(CdhExecutorFactory.java:485)
... 5 more
Caused by: javax.security.auth.login.LoginException: Connection refused
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:767)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:784)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:846)
... 8 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.security.krb5.internal.TCPClient.<init>(NetClient.java:65)
at sun.security.krb5.internal.NetClient.getInstance(NetClient.java:43)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:372)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:343)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.krb5.KdcComm.send(KdcComm.java:327)
at sun.security.krb5.KdcComm.send(KdcComm.java:219)
at sun.security.krb5.KdcComm.send(KdcComm.java:191)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:319)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:364)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:735)
... 21 more
The wizard offered no recourse or suggestions for how to continue. Going back to the CM "home," I get the same error every time I try to restart the Cloudera Management Service. I get a different error if I try to restart my cluster first instead, but it also seems to indicate that Kerberos is not working:
2015-03-19 21:25:43,255 ERROR org.apache.zookeeper.server.ZooKeeperServerMain: Unexpected exception, exiting abnormally
java.io.IOException: Could not configure server because SASL configuration did not allow the ZooKeeper server to authenticate itself properly: javax.security.auth.login.LoginException: Connection refused
at org.apache.zookeeper.server.ServerCnxnFactory.configureSaslLogin(ServerCnxnFactory.java:207)
at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:116)
at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:121)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
However, the "security inspector" tool executes successfully, and all the principals listed under the Kerberos->credentials page match the principals I can see on my KDC's kadmin shell.
Any idea what's gone wrong, or how I can troubleshoot it?
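One way to narrow this down: the innermost cause in the trace above is a java.net.ConnectException raised from sun.security.krb5.internal.TCPClient, which suggests the JVM could not open a TCP connection to the KDC. The following is a minimal sketch of a check to run from a cluster host, assuming the KDC listens on the standard Kerberos port 88. The function name kdc_tcp_reachable is hypothetical, and the host name is the placeholder used elsewhere in this thread.

```python
import socket


def kdc_tcp_reachable(host, port=88, timeout=3.0):
    """Try to open a TCP connection to the KDC.

    Returns (True, None) on success, or (False, reason) when the
    connection is refused, times out, or the host cannot be resolved.
    Port 88 is the standard Kerberos port (an assumption about this KDC).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # ConnectionRefusedError, socket.timeout and socket.gaierror
        # are all subclasses of OSError.
        return False, str(exc)


# Example use (placeholder host name from this thread):
#   ok, reason = kdc_tcp_reachable("MY.KDC.HOST.COM")
#   print("TCP reachable" if ok else "TCP failed: " + reason)
```

If this reports a refused connection while kinit from the same host works, the KDC is probably answering over UDP only, which would match the "Connection refused" in the trace.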
Created 03-19-2015 05:18 PM
Created 03-20-2015 07:24 AM
I had CM create and deploy the krb5.conf files originally. I've tried modifying them manually a few times since then, because I can no longer use CM to push changes to them while the services refuse to start. The content of the file is:
[libdefaults]
default_realm = MY.REALM.COM
dns_lookup_kdc = true
dns_lookup_realm = false
ticket_lifetime = 36000
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = aes256-cts:normal
default_tkt_enctypes = aes256-cts:normal
permitted_enctypes = aes256-cts:normal
udp_preference_limit = 1
[realms]
MY.REALM.COM = {
kdc = MY.KDC.HOST.COM
admin_server = MY.KDC.HOST.COM
default_domain = MY.KDC.HOST.COM
}
[domain_realm]
.my.realm.com = MY.REALM.COM
my.realm.com = MY.REALM.COM
[logging]
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmin.log
default = FILE:/var/log/krb5lib.log
includedir /etc/krb5.conf.d/
I do have AES strong encryption in use, but both of the security jars are present:
# ls -lah /usr/java/latest/jre/lib/security/*.jar
-rw-rw-r-- 1 root root 2.5K May 31 2011 /usr/java/latest/jre/lib/security/local_policy.jar
-rw-rw-r-- 1 root root 2.5K May 31 2011 /usr/java/latest/jre/lib/security/US_export_policy.jar
Lookups also seem to be returning as expected. Where should I go from here?
Created 03-20-2015 08:45 AM
Update: I found the solution on another user's post here: https://community.cloudera.com/t5/Cloudera-Manager-Installation/aftper-config-kerberos-on-CDH5-Servi...
It looks like Cloudera Manager only uses TCP to talk to the KDC, even though enabling TCP on your KDC is not recommended because there is little protection against denial-of-service attacks (http://linux.die.net/man/5/kdc.conf). Why is this not noted in the documentation as a requirement for the KDC?
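For anyone landing here later, a minimal sketch of the KDC-side change, assuming an MIT Kerberos KDC whose kdc.conf did not have TCP listening enabled. The linked post is truncated above, so this illustrates the idea rather than quoting that post. The kdc_tcp_ports relation is described in the kdc.conf man page linked above; port 88 and the [kdcdefaults] placement are assumptions for this setup, and the file location varies by distribution.

```
[kdcdefaults]
 kdc_ports = 88
 # Also listen on TCP so clients that only use TCP can reach the KDC
 kdc_tcp_ports = 88
```

The KDC daemon has to be restarted to pick up the change. Some newer MIT krb5 releases document kdc_tcp_listen for the same purpose, so check the man page for the installed version.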
Created 05-06-2015 04:56 PM
Because we expect the KDC in a Hadoop environment to be protected from this by network design. The reason we hard-set TCP rather than UDP is that in complex network environments, UDP can cause failures within the cluster. IMHO the DoS concern is valid for internet-facing or hostile network environments, which we strongly recommend against using for a cluster environment.
Created 05-06-2015 05:01 PM
Also, the DoS issue with TCP seems to have been resolved as of krb5 1.10: https://krbdev.mit.edu/rt/Ticket/Display.html?id=1316
Created 05-06-2015 05:02 PM
We can also evaluate expanding the example and the description of what we do by default. Thanks for pointing this out; we'll continue to review.
Created 02-21-2020 02:58 AM
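Try setting "udp_preference_limit" for the Kerberos clients. A value of 1 makes the client library try TCP before UDP for every request, so the KDC must also accept TCP connections.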
/etc/krb5.conf
[libdefaults]
udp_preference_limit = 1