
OpenTSDB with Kerberos: Cannot renew TGT with kinit -R

I'm using OpenTSDB in a kerberized cluster. I start OpenTSDB as root using:

CLASSPATH=$CLASSPATH:/home/applications/opentsdb/conf/ JVMARGS="${JVMARGS} -enableassertions -enablesystemassertions -Djava.security.auth.login.config=/home/applications/opentsdb/conf/opentsdb_client_jaas.conf" /home/applications/opentsdb/opentsdb-2.3.0/build/tsdb tsd --config /home/applications/opentsdb/conf/opentsdb.conf

The JAAS config file looks like this:

Client {
com.sun.security.auth.module.Krb5LoginModule required debug=false
renewTGT=true
useKeyTab=true
keyTab="/etc/security/keytabs/opentsdb.service.keytab"
principal="opentsdb/host.cluster@XXX.YYY.COM"
useTicketCache=true;
};

Everything starts just fine and in the OpenTSDB log file I see:

Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #3] async.auth.KerberosClientAuthProvider
Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #3] async.auth.KerberosClientAuthProvider
Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.HBaseClient
Added client for region RegionInfo(table="tsdb", region_name="tsdb,,1497401874292.983451b817366a624c42c20e7c91af67.", stop_key="\x0B\x00\t\xD7S2Q"), which was added to the regions cache.  Now we know that RegionClient@785572588(chan=null, #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 1 region.
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #2] async.auth.KerberosClientAuthProvider
Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #2] async.auth.KerberosClientAuthProvider
Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.HBaseClient
Added client for region RegionInfo(table="tsdb-uid", region_name="tsdb-uid,,1482497591937.0049eec9a851bc64e12ed2a0540192eb.", stop_key=""), which was added to the regions cache.  Now we know that RegionClient@599240979(chan=null, #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 1 region.
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.SecureRpcHelper96
SASL client context established. Negotiated QoP: auth on for: RegionClient@159145664(chan=null, #pending_rpcs=2, #batched=0, #rpcs_inflight=0)
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.RegionClient
Initialized security helper: org.hbase.async.SecureRpcHelper96@4ce85bd8 for region client: RegionClient@159145664(chan=null, #pending_rpcs=2, #batched=0, #rpcs_inflight=0)
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.KerberosClientAuthProvider
Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.KerberosClientAuthProvider
Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.Login
Initialized kerberos login context
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.Login
Scheduled ticket renewal in 29266667 ms
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.Login
TGT expires:                  Fri Jun 16 09:55:12 CEST 2017
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.Login
TGT valid starting at:        Thu Jun 15 23:55:12 CEST 2017
Thu Jun 15 23:55:12 GMT+200 2017 INFO [AsyncHBase I/O Worker #1] async.auth.Login
Successfully logged in

The TGT is granted for 10 hours. OpenTSDB says that it will try to renew the TGT in a little over 8 hours. When it does try to renew the TGT, I see the following:
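As a quick sanity check on those numbers, the scheduled delay can be compared against the ticket lifetime. This is only a small Python sketch; the variable names are made up for illustration, and the inputs are simply the values printed in the log above.

```python
from datetime import datetime, timedelta

# Values copied from the log excerpt (all timestamps are CEST).
tgt_start = datetime(2017, 6, 15, 23, 55, 12)
tgt_expires = datetime(2017, 6, 16, 9, 55, 12)
renewal_delay = timedelta(milliseconds=29266667)

lifetime = tgt_expires - tgt_start
# Share of the ticket lifetime that elapses before the renewal attempt.
fraction = renewal_delay / lifetime
renewal_time = tgt_start + renewal_delay

print(f"lifetime:     {lifetime}")
print(f"renew after:  {renewal_delay} ({fraction:.1%} of lifetime)")
print(f"renewal time: {renewal_time}")
```

So the renewal attempt is scheduled roughly 81% of the way through the ten-hour lifetime, shortly after 08:00 on Jun 16.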

Thu Jun 15 06:26:39 GMT+200 2017 ERROR [AsyncHBase Timer HBaseClient #1] async.auth.Login
Failed to renew ticket
java.lang.RuntimeException: Could not renew TGT due to problem running shell command: '/usr/bin/kinit -R';
at org.hbase.async.auth.Login.refreshTicketCache(Login.java:340) ~[asynchbase-1.7.2.jar:na]
at org.hbase.async.auth.Login.access$100(Login.java:61) ~[asynchbase-1.7.2.jar:na]
at org.hbase.async.auth.Login$TicketRenewalTask.run(Login.java:386) ~[asynchbase-1.7.2.jar:na]
at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:556) [netty-3.9.4.Final.jar:na]
at org.jboss.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:632) [netty-3.9.4.Final.jar:na]
at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:369) [netty-3.9.4.Final.jar:na]
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials
at org.apache.zookeeper.Shell.runCommand(Shell.java:225) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.Shell.run(Shell.java:152) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.Shell$ShellCommandExecutor.execute(Shell.java:345) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.Shell.execCommand(Shell.java:431) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.Shell.execCommand(Shell.java:414) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.hbase.async.auth.Login.refreshTicketCache(Login.java:338) ~[asynchbase-1.7.2.jar:na]
... 7 common frames omitted

This part:

Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials

leads me to think that it's trying to renew the TGT for the root user and isn't using the OpenTSDB keytab files. How can I get this to work properly?

Can I set

useTicketCache=false;

in the JAAS file? I do not have an OpenTSDB user on the cluster; only the service principals exist in AD.
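To illustrate why a bare `kinit -R` run as root ends up looking at /tmp/krb5cc_0, here is a rough Python sketch of the lookup. It assumes the common MIT Kerberos behaviour where `KRB5CCNAME` takes precedence and the fallback is a file named /tmp/krb5cc_<uid>; a site's krb5.conf can change that default, so treat the result as a guess. `default_ccache` is a hypothetical helper, not part of any Kerberos tooling.

```python
import os

def default_ccache(env=None, uid=None):
    """Guess which credentials cache a bare `kinit -R` would use.

    Assumption: KRB5CCNAME wins if set; otherwise the traditional
    /tmp/krb5cc_<uid> file is used. krb5.conf may override this.
    """
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    name = env.get("KRB5CCNAME")
    if name:
        # Strip an optional "FILE:" type prefix to get a plain path.
        # Other cache types (e.g. KEYRING:) are returned unchanged.
        return name[5:] if name.startswith("FILE:") else name
    return f"/tmp/krb5cc_{uid}"

path = default_ccache()
print(path, "exists" if os.path.exists(path) else "is missing (or is not a file cache)")
```

With an empty environment and uid 0 this yields /tmp/krb5cc_0, the same file name that appears in the kinit error.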

7 REPLIES

@Dr. Jason Breitweg

I believe that you should not be using the interactive user's ticket cache and should instead let JAAS manage the credentials. The JAAS conf file should look like:

Client {
  com.sun.security.auth.module.Krb5LoginModule required 
  debug=false
  renewTGT=false
  useKeyTab=false
  storeKey=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=true;
};

I was thinking of trying:

Client {
  com.sun.security.auth.module.Krb5LoginModule required 
  debug=false
  renewTGT=true
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=false;
};

Wouldn't this then try to renew the TGT, but use the keytab file instead of the ticket cache?

The renewTGT option is not valid if you are not using the ticket cache. From the Krb5LoginModule docs:

renewTGT:
 Set this to true, if you want to renew the TGT. If this is set, useTicketCache must also be set to true; otherwise a configuration error will be returned.

So the config I specified above should be what you need.
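The rule quoted from the Krb5LoginModule docs can be written down as a small check. The Python sketch below is illustrative only: the parser is deliberately naive, `parse_jaas_options` and `check_options` are made-up helpers, and the only rule encoded is the renewTGT/useTicketCache one quoted above.

```python
import re

def parse_jaas_options(entry: str) -> dict:
    """Collect key=value options from one JAAS entry body (naive parser)."""
    pairs = re.findall(r'(\w+)\s*=\s*("[^"]*"|[^\s;]+)', entry)
    return {key: value.strip('"') for key, value in pairs}

def check_options(opts: dict) -> list:
    """Return a list of problems; only encodes the rule quoted above."""
    problems = []
    if opts.get("renewTGT") == "true" and opts.get("useTicketCache") != "true":
        problems.append("renewTGT=true requires useTicketCache=true")
    return problems

# The variant with renewTGT=true and useTicketCache=false from this thread.
proposed = '''
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=true
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=false;
'''
print(check_options(parse_jaas_options(proposed)))
```

For that variant the check reports one problem, which matches the documentation's statement that the combination is a configuration error.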

But this is the whole problem, isn't it? OpenTSDB is running as root, so it tries to renew from the /tmp/krb5cc_0 file:

Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials

But there is no /tmp/krb5cc_0 file, because when OpenTSDB starts it just reads from the keytab and never saves the TGT into a krb5cc_<uid> file. Because of this, the renewal can't work at all. I'm beginning to think I need to use:

Client {
com.sun.security.auth.module.Krb5LoginModule required debug=false
renewTGT=false
useKeyTab=true
keyTab="/etc/security/keytabs/opentsdb.service.keytab"
principal="opentsdb/host.cluster@XXX.YYY.COM"
useTicketCache=false;
};

But then I'm not sure what will happen when the TGT expires; is the system clever enough to get the TGT from the keytab again?

What does storeKey=true do in your example file?

Something else must be wrong. When using JAAS and not using the ticket cache, JAAS handles the caching and renewal internally... I am not sure of the details, though. Are you sure your JAAS configuration file is being used by the process?
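One way to check that last point on Linux is to look at the running JVM's command line for the `java.security.auth.login.config` system property. The Python sketch below assumes /proc is available and that the property is passed on the command line (it could also be set in other ways, which this would not detect); `jaas_config_from_args` and `jvm_args` are hypothetical helpers.

```python
from pathlib import Path

PROP = "-Djava.security.auth.login.config="

def jaas_config_from_args(args):
    """Return the first JAAS config path found in a JVM argument list, or None."""
    for arg in args:
        if arg.startswith(PROP):
            return arg[len(PROP):]
    return None

def jvm_args(pid):
    """Read a process's argument vector from /proc (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    return [a.decode(errors="replace") for a in raw.split(b"\0") if a]

# Example with an argument list shaped like the start command in the question.
sample = [
    "java",
    "-enableassertions",
    "-enablesystemassertions",
    PROP + "/home/applications/opentsdb/conf/opentsdb_client_jaas.conf",
]
print(jaas_config_from_args(sample))
```

Against a live process this would be called as `jaas_config_from_args(jvm_args(pid))`, with `pid` being the TSD's process ID.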

After testing, it turns out that this

Client {
com.sun.security.auth.module.Krb5LoginModule required
debug=false
renewTGT=false
useKeyTab=true
keyTab="/etc/security/keytabs/opentsdb.service.keytab"
principal="opentsdb/host.cluster@XXX.YYY.COM"
useTicketCache=false;
};

is indeed what works.

When starting OpenTSDB I see this:

2017-06-22 13:52:57,366 INFO  [Thread-1] Login: TGT refresh thread started.
2017-06-22 13:52:57,376 INFO  [Thread-1] Login: TGT valid starting at:        Thu Jun 22 13:52:57 CEST 2017
2017-06-22 13:52:57,376 INFO  [Thread-1] Login: TGT expires:                  Thu Jun 22 23:52:57 CEST 2017
2017-06-22 13:52:57,376 INFO  [Thread-1] Login: TGT refresh sleeping until: Thu Jun 22 22:06:24 CEST 2017

And then it refreshes the TGT when it said it would (22:06):

2017-06-22 22:06:24,667 INFO  [Thread-1] Login: Initiating logout for opentsdb/host.cluster@XXX.YYY.COM
2017-06-22 22:06:24,668 INFO  [Thread-1] Login: Initiating re-login for opentsdb/host.cluster@XXX.YYY.COM
2017-06-22 22:06:24,677 INFO  [Thread-1] Login: TGT valid starting at:        Thu Jun 22 22:06:24 CEST 2017
2017-06-22 22:06:24,677 INFO  [Thread-1] Login: TGT expires:                  Fri Jun 23 08:06:24 CEST 2017
2017-06-22 22:06:24,677 INFO  [Thread-1] Login: TGT refresh sleeping until: Fri Jun 23 06:19:34 CEST 2017

The other two variations of the JAAS file that I tried ended up complaining because of various kinit errors.
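For anyone comparing their own logs, the two refresh cycles above can be checked against the ticket lifetime with a few lines of Python. This is just a sketch: `parse_ts` is a made-up helper that ignores the time zone name (fine here since every timestamp is CEST), and the timestamps are copied from the log excerpts.

```python
from datetime import datetime

def parse_ts(text):
    """Parse e.g. 'Thu Jun 22 13:52:57 CEST 2017', ignoring the zone name."""
    _dow, mon, day, clock, _zone, year = text.split()
    return datetime.strptime(f"{mon} {day} {clock} {year}", "%b %d %H:%M:%S %Y")

# (valid from, expires, refresh scheduled) for each cycle in the logs.
cycles = [
    ("Thu Jun 22 13:52:57 CEST 2017", "Thu Jun 22 23:52:57 CEST 2017", "Thu Jun 22 22:06:24 CEST 2017"),
    ("Thu Jun 22 22:06:24 CEST 2017", "Fri Jun 23 08:06:24 CEST 2017", "Fri Jun 23 06:19:34 CEST 2017"),
]

fractions = []
for start, expires, refresh in cycles:
    start, expires, refresh = map(parse_ts, (start, expires, refresh))
    # Share of the ticket lifetime that has elapsed when the refresh runs.
    fractions.append((refresh - start) / (expires - start))
    print(f"refresh at {fractions[-1]:.1%} of the ticket lifetime")
```

Both refreshes fall a little past 80% of the ten-hour lifetime, so the re-login from the keytab happens well before the TGT expires.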

Yikes... it appears that I had an error in the JAAS config that I posted. It was a typo on my part. However, I am glad you found the issue and fixed it.

I accidentally had

useKeyTab=false

where the proper value was supposed to be

useKeyTab=true

My apologies.