Created on 06-16-2017 11:22 AM - edited 09-16-2022 04:46 AM
I'm using OpenTSDB in a kerberized cluster. I start OpenTSDB as root using
CLASSPATH=$CLASSPATH:/home/applications/opentsdb/conf/ JVMARGS="${JVMARGS} -enableassertions -enablesystemassertions -Djava.security.auth.login.config=/home/applications/opentsdb/conf/opentsdb_client_jaas.conf" /home/applications/opentsdb/opentsdb-2.3.0/build/tsdb tsd --config /home/applications/opentsdb/conf/opentsdb.conf
The jaas config file looks like this:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=true
  useKeyTab=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=true;
};
Everything starts just fine and in the OpenTSDB log file I see:
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #3 async.auth.KerberosClientAuthProvider Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #3 async.auth.KerberosClientAuthProvider Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.HBaseClient Added client for region RegionInfo(table="tsdb", region_name="tsdb,,1497401874292.983451b817366a624c42c20e7c91af67.", stop_key="\x0B\x00\t\xD7S2Q"), which was added to the regions cache. Now we know that RegionClient@785572588(chan=null, #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 1 region.
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #2 async.auth.KerberosClientAuthProvider Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #2 async.auth.KerberosClientAuthProvider Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.HBaseClient Added client for region RegionInfo(table="tsdb-uid", region_name="tsdb-uid,,1482497591937.0049eec9a851bc64e12ed2a0540192eb.", stop_key=""), which was added to the regions cache. Now we know that RegionClient@599240979(chan=null, #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 1 region.
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.SecureRpcHelper96 SASL client context established. Negotiated QoP: auth on for: RegionClient@159145664(chan=null, #pending_rpcs=2, #batched=0, #rpcs_inflight=0)
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.RegionClient Initialized security helper: org.hbase.async.SecureRpcHelper96@4ce85bd8 for region client: RegionClient@159145664(chan=null, #pending_rpcs=2, #batched=0, #rpcs_inflight=0)
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.KerberosClientAuthProvider Client will use GSSAPI as SASL mechanism.
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.KerberosClientAuthProvider Connecting to hbase/host.cluster@XXX.YYY.COM
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.Login Initialized kerberos login context
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.Login Scheduled ticket renewal in 29266667 ms
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.Login TGT expires: Fri Jun 16 09:55:12 CEST 2017
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.Login TGT valid starting at: Thu Jun 15 23:55:12 CEST 2017
Thu Jun 15 23:55:12 GMT+200 2017 INFO AsyncHBase I/O Worker #1 async.auth.Login Successfully logged in
The TGT is granted for 10 hours. OpenTSDB says that it will try to renew the TGT in a little over 8 hours (29266667 ms). When it does try to renew the TGT, I see the following:
Thu Jun 15 06:26:39 GMT+200 2017 ERROR AsyncHBase Timer HBaseClient #1 async.auth.Login Failed to renew ticket
java.lang.RuntimeException: Could not renew TGT due to problem running shell command: '/usr/bin/kinit -R';
    at org.hbase.async.auth.Login.refreshTicketCache(Login.java:340) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.auth.Login.access$100(Login.java:61) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.auth.Login$TicketRenewalTask.run(Login.java:386) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:556) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:632) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:369) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials
    at org.apache.zookeeper.Shell.runCommand(Shell.java:225) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.Shell.run(Shell.java:152) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.Shell$ShellCommandExecutor.execute(Shell.java:345) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.Shell.execCommand(Shell.java:431) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.Shell.execCommand(Shell.java:414) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.hbase.async.auth.Login.refreshTicketCache(Login.java:338) ~[asynchbase-1.7.2.jar:na]
    ... 7 common frames omitted
This part:
Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials
leads me to think that it's trying to renew the TGT for the root user and isn't using the OpenTSDB keytab files. How can I get this to work properly?
Can I set
useTicketCache=false;
in the jaas file? I do not have an OpenTSDB user on the cluster, only the service principals exist in the AD.
Created 06-16-2017 02:13 PM
I believe that you should not be using the interactive user's ticket cache and should instead let JAAS manage the credentials. The JAAS conf file should look like:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=false
  useKeyTab=false
  storeKey=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=true;
};
Created 06-20-2017 05:55 AM
I was thinking of trying:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=true
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=false;
};
Wouldn't this still try to renew the TGT, but use the keytab file rather than the ticket cache?
Created 06-20-2017 02:37 PM
The renewTGT option is not valid if not using the ticket cache. From Krb5LoginModule Docs:
renewTGT: Set this to true, if you want to renew the TGT. If this is set, useTicketCache must also be set to true; otherwise a configuration error will be returned.
So the config I specified above should be what you need.
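The constraint quoted from the Krb5LoginModule docs can be written down as a small check. This is only a minimal sketch: `parse_jaas_options` and `check_options` are hypothetical helpers written for this thread, not part of any library, and the parser assumes a single login-module entry with simple `key=value` options like the ones posted here. It checks only the one rule quoted above.

```python
import re

# Matches key=value pairs where the value is either quoted or a bare token.
OPTION_RE = re.compile(r'(\w+)=("[^"]*"|[^\s;]+)')


def parse_jaas_options(entry: str) -> dict:
    """Extract key=value options from one JAAS login-module entry (hypothetical helper)."""
    return {key: value.strip('"') for key, value in OPTION_RE.findall(entry)}


def check_options(options: dict) -> list:
    """Return a list of problems for the rule quoted above; an empty list means none found."""
    problems = []
    renew_tgt = options.get("renewTGT", "false").lower() == "true"
    use_ticket_cache = options.get("useTicketCache", "false").lower() == "true"
    if renew_tgt and not use_ticket_cache:
        problems.append("renewTGT=true requires useTicketCache=true")
    return problems


if __name__ == "__main__":
    # The variant proposed earlier in the thread: renewTGT=true with useTicketCache=false.
    entry = (
        'Client { com.sun.security.auth.module.Krb5LoginModule required '
        'debug=false renewTGT=true useKeyTab=true storeKey=true '
        'keyTab="/etc/security/keytabs/opentsdb.service.keytab" '
        'principal="opentsdb/host.cluster@XXX.YYY.COM" useTicketCache=false; };'
    )
    for problem in check_options(parse_jaas_options(entry)):
        print(problem)
```

Run against the entry with renewTGT=true and useTicketCache=false, the check reports the conflict; an entry with renewTGT=false passes regardless of the ticket cache setting.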
Created 06-22-2017 07:22 AM
But this is the whole problem, isn't it? OpenTSDB is running as root; hence trying to renew from the /tmp/krb5cc_0 file:
Caused by: org.apache.zookeeper.Shell$ExitCodeException: kinit: No credentials cache found (filename: /tmp/krb5cc_0) while renewing credentials
But there is no /tmp/krb5cc_0 file, because on startup OpenTSDB just reads from the keytab and never saves the TGT into a krb5cc_uid file. Because of this, the renewal can't work at all. I'm beginning to think I need to use:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=false
  useKeyTab=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=false;
};
But then I'm not sure what will happen when the TGT expires; is the system clever enough to get the TGT from the keytab again?
What does storeKey=true do in your example file?
Created 06-22-2017 01:29 PM
Something else must be wrong. When using JAAS and not using the ticket cache, JAAS handles the caching and renewal internally... I am not sure of the details, though. Are you sure your JAAS configuration file is being used by the process?
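One way to check which JAAS file the running process was actually started with, on Linux, is to read the JVM's arguments from /proc/<pid>/cmdline, where they are stored NUL-separated. A minimal sketch, assuming you already know the PID of the TSD process; the helper names are made up for illustration.

```python
PROP = "-Djava.security.auth.login.config="


def jaas_configs_from_cmdline(raw: bytes) -> list:
    """Return every JAAS config path passed as a -D option in a NUL-separated command line."""
    paths = []
    for arg in raw.split(b"\0"):
        text = arg.decode("utf-8", errors="replace")
        if text.startswith(PROP):
            paths.append(text[len(PROP):])
    return paths


def jaas_configs_for_pid(pid: int) -> list:
    """Read /proc/<pid>/cmdline (Linux only) and extract the JAAS config paths."""
    with open(f"/proc/{pid}/cmdline", "rb") as handle:
        return jaas_configs_from_cmdline(handle.read())


if __name__ == "__main__":
    import sys

    # Usage: python check_jaas.py <pid of the TSD process>
    for path in jaas_configs_for_pid(int(sys.argv[1])):
        print(path)
```

An empty result would mean the option never reached the JVM, for example because JVMARGS was overwritten somewhere in the start script. More than one result would mean the option was passed several times, which is worth cleaning up.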
Created 06-23-2017 05:05 AM
After testing, it turns out that this
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  renewTGT=false
  useKeyTab=true
  keyTab="/etc/security/keytabs/opentsdb.service.keytab"
  principal="opentsdb/host.cluster@XXX.YYY.COM"
  useTicketCache=false;
};
is indeed what works.
When starting OpenTSDB I see this:
2017-06-22 13:52:57,366 INFO [Thread-1] Login: TGT refresh thread started.
2017-06-22 13:52:57,376 INFO [Thread-1] Login: TGT valid starting at: Thu Jun 22 13:52:57 CEST 2017
2017-06-22 13:52:57,376 INFO [Thread-1] Login: TGT expires: Thu Jun 22 23:52:57 CEST 2017
2017-06-22 13:52:57,376 INFO [Thread-1] Login: TGT refresh sleeping until: Thu Jun 22 22:06:24 CEST 2017
And then it refreshes the TGT when it said it would (22:06):
2017-06-22 22:06:24,667 INFO [Thread-1] Login: Initiating logout for opentsdb/host.cluster@XXX.YYY.COM
2017-06-22 22:06:24,668 INFO [Thread-1] Login: Initiating re-login for opentsdb/host.cluster@XXX.YYY.COM
2017-06-22 22:06:24,677 INFO [Thread-1] Login: TGT valid starting at: Thu Jun 22 22:06:24 CEST 2017
2017-06-22 22:06:24,677 INFO [Thread-1] Login: TGT expires: Fri Jun 23 08:06:24 CEST 2017
2017-06-22 22:06:24,677 INFO [Thread-1] Login: TGT refresh sleeping until: Fri Jun 23 06:19:34 CEST 2017
The other two variations of the JAAS file that I tried ended up complaining because of various kinit errors.
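For anyone comparing their own logs, the refresh times can be checked with a little arithmetic. The sketch below takes the timestamps from the logs in this thread and works out how far into the 10-hour ticket lifetime the refresh was scheduled. The numbers would be consistent with refreshing at roughly 80% of the lifetime plus a little jitter, but that reading of the Login class is an assumption on my part, not something the logs confirm.

```python
from datetime import datetime


def renewal_fraction(valid_from: datetime, expires: datetime, refresh_at: datetime) -> float:
    """Fraction of the ticket lifetime that has elapsed when the refresh is scheduled."""
    lifetime = (expires - valid_from).total_seconds()
    elapsed = (refresh_at - valid_from).total_seconds()
    return elapsed / lifetime


FMT = "%Y-%m-%d %H:%M:%S"

# First run: "Scheduled ticket renewal in 29266667 ms" against a 10-hour TGT.
first_run = 29266667 / (10 * 60 * 60 * 1000)

# Working run: timestamps from the "TGT valid starting at", "TGT expires"
# and "TGT refresh sleeping until" lines.
second_run = renewal_fraction(
    datetime.strptime("2017-06-22 13:52:57", FMT),
    datetime.strptime("2017-06-22 23:52:57", FMT),
    datetime.strptime("2017-06-22 22:06:24", FMT),
)

print(round(first_run, 3), round(second_run, 3))  # prints 0.813 0.822
```

So in both runs the refresh is due a little over 8 hours into a 10-hour ticket, which leaves well over an hour of margin before the TGT expires.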
Created 06-23-2017 09:40 AM
Yikes... it appears that I had an error in the JAAS config that I posted. It was a typo on my part. However, I am glad you found the issue and fixed it.
I accidentally had
useKeyTab=false
where the proper value was supposed to be
useKeyTab=true
My apologies.