
Kerberos ticket isn't being renewed by Solr when storing indexes on HDFS

New Contributor

Hello,

I am running HDP Search's Solr Cloud on HDP 2.3, configured to store its index files on a Kerberized HDFS. When Solr starts, it writes index files to HDFS correctly; however, after 24 hours have elapsed, Solr can no longer connect to HDFS, reporting that it no longer has a valid Kerberos TGT (my default Kerberos ticket lifetime is 24 hours).

Restarting Solr Cloud resolves the issue for another 24 hours, so it appears that Solr is not renewing its Kerberos ticket when it expires. Could this be an issue with Solr, or is there configuration I can add to get Solr to renew its ticket automatically?

This is the stack trace I'm getting in the Solr log:

java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for solr/sandbox.hortonworks.com@HORTONWORKS.COM to sandbox.hortonworks.com/10.0.2.15:8020; Host Details : local host is: "sandbox.hortonworks.com/10.0.2.15"; destination host is: "sandbox.hortonworks.com":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy10.renewLease(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:571)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy11.renewLease(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:879)
        at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
        at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
        at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
        at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't setup connection for solr/sandbox.hortonworks.com@HORTONWORKS.COM to sandbox.hortonworks.com/10.0.2.15:8020
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:672)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 16 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
        at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
        ... 19 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
        at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
        ... 28 more

This is the configuration I'm using for my collection:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
    <str name="solr.hdfs.home">hdfs://sandbox.hortonworks.com/user/solr</str>
    <str name="solr.hdfs.confdir">/usr/hdp/current/hadoop-client/conf</str>
    <bool name="solr.hdfs.blockcache.enabled">true</bool>
    <int name="solr.hdfs.blockcache.slab.count">1</int>
    <bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
    <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
    <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
    <bool name="solr.hdfs.blockcache.write.enabled">false</bool>
    <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
    <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
    <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
    <bool name="solr.hdfs.security.kerberos.enabled">true</bool>
    <str name="solr.hdfs.security.kerberos.keytabfile">/etc/solr/conf/solr.keytab</str>
    <str name="solr.hdfs.security.kerberos.principal">solr/sandbox.hortonworks.com@HORTONWORKS.COM</str>
</directoryFactory>
1 ACCEPTED SOLUTION


The Hadoop RPC client is coded to re-login from its keytab automatically if it detects that an RPC call has failed due to a SASL authentication failure. No special configuration is required, and applications (Solr in this case) do not need to write special code to trigger this re-login. I recently wrote a detailed description of this behavior on Stack Overflow.
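
For reference, the login/re-login pattern the RPC client relies on can be reproduced with Hadoop's public UserGroupInformation API. This is only a minimal sketch to illustrate the mechanism; the class name is hypothetical, and the principal and keytab path are copied from the question above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabReloginSketch {
    public static void main(String[] args) throws Exception {
        // Tell the Hadoop security layer that Kerberos is in use.
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Initial login from the keytab, as happens when Solr starts up.
        UserGroupInformation.loginUserFromKeytab(
                "solr/sandbox.hortonworks.com@HORTONWORKS.COM",
                "/etc/solr/conf/solr.keytab");

        // The RPC client performs the equivalent of this automatically after a
        // SASL failure; long-running code can also call it proactively. It is
        // a no-op while the TGT is still fresh.
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
    }
}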

If this is not working in your environment, and you start seeing authentication failures after a process runs 24 hours, then I recommend reviewing Apache JIRA HADOOP-10786. This was a bug that impacted the automatic re-login from keytab on certain JDK versions. On the JDK 7 line, I know the problem was introduced in JDK 1.7.0_80. On the JDK 8 line, I'm not certain which exact JDK release introduced the problem.

If after reviewing HADOOP-10786 you suspect this is the root cause, then you can fix it by either downgrading the JDK to 1.7.0_79 or upgrading Hadoop. The HADOOP-10786 patch changes the Hadoop code so that it works correctly with all known JDK versions. For Apache Hadoop, the fix shipped in versions 2.6.1 and 2.7.0. For HDP, the fix shipped in versions 2.2.8.0 and 2.3.0.0. All subsequent versions include the fix as well.


4 REPLIES

New Contributor

Thanks Chris, changing Solr's JVM to 1.7.0_79 seems to have done the trick.

Rising Star

@Andrew Bumstead opened SOLR-8538. It was resolved as not a problem, since downgrading to JDK 1.7.0_79 avoids the issue.

If you use JDK 8 and Solr <6.2.0, you will most likely have to manually update hadoop-common to 2.6.1+. The packaged Solr 5.5.0 and 5.5.1 for HDP 2.3, 2.4, and 2.5 seem to ship with the original Hadoop 2.6.0 dependencies. If you don't upgrade the Hadoop dependencies under Solr to 2.6.1+, you will most likely hit Kerberos ticket renewal issues, or you will have to follow the steps outlined below by @Jonas Straub to enable full Kerberos authentication.
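
If you are unsure which hadoop-common version Solr is actually running with, you can check it programmatically. A minimal sketch, assuming the usual Solr 5.x webapp layout under /opt/lucidworks-hdpsearch (the class name and library path are illustrative):

import org.apache.hadoop.util.VersionInfo;

// Compile and run with Solr's bundled Hadoop jars on the classpath, e.g.:
//   java -cp ".:/opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/lib/*" HadoopVersionCheck
public class HadoopVersionCheck {
    public static void main(String[] args) {
        // Anything below 2.6.1 still carries the HADOOP-10786 re-login bug.
        System.out.println("hadoop-common version: " + VersionInfo.getVersion());
    }
}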


Renewing Kerberos tickets does not currently work in the Solr standalone (single-node) deployment. There is a bug in Solr that prevents the renewal of Kerberos tickets (I have an open ticket regarding this issue).

However, the SolrCloud deployment should renew its tickets. Below is a sample configuration that enables Kerberos for SolrCloud and ensures that Solr's Kerberos tickets are renewed. (Note: this will also enable Kerberos authentication for the Solr Admin UI.)

/opt/lucidworks-hdpsearch/solr/bin/jaas.conf

Client {
        com.sun.security.auth.module.Krb5LoginModule required
        useKeyTab=true
        keyTab="/etc/security/keytabs/solr.headless.keytab"
        storeKey=true
        debug=true
        principal="solr-mycluster@EXAMPLE.COM";
};

/opt/lucidworks-hdpsearch/solr/bin/solr.in.sh

SOLR_HOST=`hostname -f`
# ZooKeeper ensemble, using the /solr chroot created below
ZK_HOST="zookeeperA.example.com:2181,zookeeperB.example.com:2181,zookeeperC.example.com:2181/solr"
# SPNEGO principal and keytab used to authenticate HTTP requests to Solr
SOLR_KERB_PRINCIPAL=HTTP/${SOLR_HOST}@EXAMPLE.COM
SOLR_KERB_KEYTAB=/etc/security/keytabs/solr-spnego.service.keytab
# JAAS config from above, used for ZooKeeper and inter-node authentication
SOLR_JAAS_FILE=/opt/lucidworks-hdpsearch/solr/bin/jaas.conf
SOLR_AUTHENTICATION_CLIENT_CONFIGURER=org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer
# The name rule maps the headless principal solr-mycluster@EXAMPLE.COM to the local user "solr"
SOLR_AUTHENTICATION_OPTS=" -DauthenticationPlugin=org.apache.solr.security.KerberosPlugin -Djava.security.auth.login.config=${SOLR_JAAS_FILE} -Dsolr.kerberos.principal=${SOLR_KERB_PRINCIPAL} -Dsolr.kerberos.keytab=${SOLR_KERB_KEYTAB} -Dsolr.kerberos.cookie.domain=${SOLR_HOST} -Dhost=${SOLR_HOST} -Dsolr.kerberos.name.rules=RULE:[1:$1@$0](solr-mycluster@EXAMPLE.COM)s/.*/solr/DEFAULT"

/etc/security/keytabs/solr.headless.keytab

Keytab file for the solr-mycluster@EXAMPLE.COM principal

Create the following znode in ZooKeeper and upload the security.json that enables the Kerberos plugin:

/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost zookeeperA.example.com:2181,zookeeperB.example.com:2181,zookeeperC.example.com:2181 -cmd makepath /solr

/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost zookeeperA.example.com:2181,zookeeperB.example.com:2181,zookeeperC.example.com:2181 -cmd put /solr/security.json '{"authentication":{"class": "org.apache.solr.security.KerberosPlugin"}}'
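
Once security.json is in place, clients must authenticate as well. As a minimal sketch, a SolrJ client (Solr 5.x API, where Krb5HttpClientConfigurer is still available) can reuse the jaas.conf from above; the class and collection names are hypothetical:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;

public class KerberizedSolrClientSketch {
    public static void main(String[] args) throws Exception {
        // Same JAAS file as the Solr nodes; it supplies the Kerberos
        // credentials that SolrJ uses for SPNEGO authentication.
        System.setProperty("java.security.auth.login.config",
                "/opt/lucidworks-hdpsearch/solr/bin/jaas.conf");
        HttpClientUtil.setConfigurer(new Krb5HttpClientConfigurer());

        CloudSolrClient solr = new CloudSolrClient(
                "zookeeperA.example.com:2181,zookeeperB.example.com:2181,zookeeperC.example.com:2181/solr");
        solr.setDefaultCollection("collection1"); // hypothetical collection name
        System.out.println("numFound: "
                + solr.query(new SolrQuery("*:*")).getResults().getNumFound());
        solr.close();
    }
}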

solrconfig.xml

Your solrconfig.xml looks fine.

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://mycluster/solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
  <bool name="solr.hdfs.security.kerberos.enabled">true</bool>
  <str name="solr.hdfs.security.kerberos.keytabfile">/etc/security/keytabs/solr.headless.keytab</str>
  <str name="solr.hdfs.security.kerberos.principal">solr-mycluster@EXAMPLE.COM</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">64</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">512</int>
</directoryFactory>

You can find more information on this page: https://cwiki.apache.org/confluence/display/RANGER/How+to+configure+Solr+Cloud+with+Kerberos+for+Ran...