Ambari Infra (Solr) doesn't start after cluster kerberization

I kerberized our cluster, and everything looked fine until Ambari Infra started up. It fails with the following error:

--security-json-location /etc/ambari-infra-solr/conf/security.json' returned 1. Using default ZkCredentialsProvider
Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
Client environment:host.name=hdp-cluster-master1.apollon.mydomain.com
Client environment:java.version=1.8.0_131
Client environment:java.vendor=Oracle Corporation
Client environment:java.home=/usr/java/jdk1.8.0_131/jre
Client environment:java.class.path=/usr/lib/ambari-infra-solr-client:/usr/lib/ambari-infra-solr-client/libs/ambari-logsearch-solr-client-2.5.0.3.7.jar:/usr/lib/ambari-infra-solr-client/libs/antlr-2.7.7.jar:/usr/lib/ambari-infra-solr-client/libs/antlr4-runtime-4.5.3.jar:/usr/lib/ambari-infra-solr-client/libs/checkstyle-6.19.jar:/usr/lib/ambari-infra-solr-client/libs/commons-beanutils-1.9.2.jar:/usr/lib/ambari-infra-solr-client/libs/commons-cli-1.3.1.jar:/usr/lib/ambari-infra-solr-client/libs/commons-codec-1.8.jar:/usr/lib/ambari-infra-solr-client/libs/commons-collections-3.2.2.jar:/usr/lib/ambari-infra-solr-client/libs/commons-io-2.1.jar:/usr/lib/ambari-infra-solr-client/libs/commons-lang-2.5.jar:/usr/lib/ambari-infra-solr-client/libs/commons-logging-1.1.1.jar:/usr/lib/ambari-infra-solr-client/libs/easymock-3.4.jar:/usr/lib/ambari-infra-solr-client/libs/guava-16.0.jar:/usr/lib/ambari-infra-solr-client/libs/hamcrest-core-1.1.jar:/usr/lib/ambari-infra-solr-client/libs/httpclient-4.4.1.jar:/usr/lib/ambari-infra-solr-client/libs/httpcore-4.4.1.jar:/usr/lib/ambari-infra-solr-client/libs/httpmime-4.4.1.jar:/usr/lib/ambari-infra-solr-client/libs/jackson-core-asl-1.9.9.jar:/usr/lib/ambari-infra-solr-client/libs/jackson-mapper-asl-1.9.13.jar:/usr/lib/ambari-infra-solr-client/libs/jcl-over-slf4j-1.7.7.jar:/usr/lib/ambari-infra-solr-client/libs/junit-4.10.jar:/usr/lib/ambari-infra-solr-client/libs/log4j-1.2.17.jar:/usr/lib/ambari-infra-solr-client/libs/noggit-0.6.jar:/usr/lib/ambari-infra-solr-client/libs/objenesis-2.2.jar:/usr/lib/ambari-infra-solr-client/libs/slf4j-api-1.7.2.jar:/usr/lib/ambari-infra-solr-client/libs/slf4j-log4j12-1.7.2.jar:/usr/lib/ambari-infra-solr-client/libs/solr-solrj-5.5.2.jar:/usr/lib/ambari-infra-solr-client/libs/stax2-api-3.1.4.jar:/usr/lib/ambari-infra-solr-client/libs/tools-1.7.0.jar:/usr/lib/ambari-infra-solr-client/libs/utility-1.0.0.0-SNAPSHOT.jar:/usr/lib/ambari-infra-solr-client/libs/woodstox-core-asl-4.4.1.jar:/usr/lib/ambari-infra-solr-client/libs/zookeeper-3.4.6.jar
Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
Client environment:java.io.tmpdir=/tmp
Client environment:java.compiler=<NA>
Client environment:os.name=Linux
Client environment:os.arch=amd64
Client environment:os.version=3.10.0-514.21.1.el7.x86_64
Client environment:user.name=root
Client environment:user.home=/root
Client environment:user.dir=/var/lib/ambari-agent
Initiating client connection, connectString=hdp-cluster-master1.apollon.mydomain.com:2181,hdp-cluster-master2.apollon.mydomain.com:2181,hdp-cluster-master3.apollon.mydomain.com:2181 sessionTimeout=15000 watcher=org.apache.solr.common.cloud.SolrZkClient$3@512ddf17
Waiting for client to connect to ZooKeeper
successfully logged in.
TGT refresh thread started.
Client will use GSSAPI as SASL mechanism.
TGT valid starting at:        Tue May 30 11:38:36 CEST 2017
TGT expires:                  Tue May 30 21:38:36 CEST 2017
TGT refresh sleeping until: Tue May 30 19:41:46 CEST 2017
Opening socket connection to server ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
Socket connection established to ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181, initiating session
Session establishment complete on server ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181, sessionid = 0x35c584466420007, negotiated timeout = 15000
Watcher org.apache.solr.common.cloud.ConnectionManager@3a376007 name:ZooKeeperConnection Watcher:hdp-cluster-master1.apollon.mydomain.com:2181,hdp-cluster-master2.apollon.mydomain.com:2181,hdp-cluster-master3.apollon.mydomain.com:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
Client is connected to ZooKeeper
Using default ZkACLProvider
Setup kerberos plugin in security.json
An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
Watcher org.apache.solr.common.cloud.ConnectionManager@3a376007 name:ZooKeeperConnection Watcher:hdp-cluster-master1.apollon.mydomain.com:2181,hdp-cluster-master2.apollon.mydomain.com:2181,hdp-cluster-master3.apollon.mydomain.com:2181 got event WatchedEvent state:AuthFailed type:None path:null path:null type:None
zkClient received AuthFailed
KeeperErrorCode = AuthFailed for /infra-solr/security.json
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /infra-solr/security.json
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:311)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:308)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
    at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:308)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.getFileContentFromZnode(EnableKerberosPluginSolrZkCommand.java:71)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.executeZkCommand(EnableKerberosPluginSolrZkCommand.java:45)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.executeZkCommand(EnableKerberosPluginSolrZkCommand.java:32)
    at org.apache.ambari.logsearch.solr.commands.AbstractZookeeperRetryCommand.createAndProcessRequest(AbstractZookeeperRetryCommand.java:38)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.retry(AbstractRetryCommand.java:45)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.run(AbstractRetryCommand.java:40)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudClient.setupKerberosPlugin(AmbariSolrCloudClient.java:162)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudCLI.main(AmbariSolrCloudCLI.java:518)
KeeperErrorCode = AuthFailed for /infra-solr/security.json
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /infra-solr/security.json
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:311)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:308)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
    at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:308)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.getFileContentFromZnode(EnableKerberosPluginSolrZkCommand.java:71)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.executeZkCommand(EnableKerberosPluginSolrZkCommand.java:45)
    at org.apache.ambari.logsearch.solr.commands.EnableKerberosPluginSolrZkCommand.executeZkCommand(EnableKerberosPluginSolrZkCommand.java:32)
    at org.apache.ambari.logsearch.solr.commands.AbstractZookeeperRetryCommand.createAndProcessRequest(AbstractZookeeperRetryCommand.java:38)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.retry(AbstractRetryCommand.java:45)
    at org.apache.ambari.logsearch.solr.commands.AbstractRetryCommand.run(AbstractRetryCommand.java:40)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudClient.setupKerberosPlugin(AmbariSolrCloudClient.java:162)
    at org.apache.ambari.logsearch.solr.AmbariSolrCloudCLI.main(AmbariSolrCloudCLI.java:518)
Command failed, tries again (tries: 1)
KeeperErrorCode = AuthFailed for /infra-solr/security.json

It looks like the Ambari Solr client can't access the ZooKeeper server quorum. I tested this with the ZooKeeper command-line client and the following zookeeper_client_jaas.conf file:

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/etc/security/keytabs/zk.service.keytab"
principal="zookeeper/hdp-cluster-master1.apollon.mydomain.com@MYDOMAIN.MYDOMAINROOT.NET";
};

Then I started the ZooKeeper client with

/usr/hdp/current/zookeeper-server/bin/zkCli.sh

or

/usr/hdp/current/zookeeper-client/bin/zookeeper-client -server hdp-cluster-master1.apollon.mydomain.com:2181,hdp-cluster-master2.apollon.mydomain.com:2181,hdp-cluster-master3.apollon.mydomain.com:2181
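The post does not say how the JAAS file reaches the client JVM. A minimal sketch of one common way, assuming the launcher script honors the `CLIENT_JVMFLAGS` environment variable (the stock `zkCli.sh` does; an Ambari-managed wrapper may source its own env file and override it). The helper names `client_env` and `run_zkcli` are made up for illustration. The optional `-Dsun.security.krb5.debug=true` JDK flag makes the client print which service principal it requests from the KDC.

```python
import os
import subprocess


def client_env(jaas_conf, krb5_debug=True, base_env=None):
    """Return a copy of the environment with CLIENT_JVMFLAGS set for a Kerberized client.

    jaas_conf  -- path to the JAAS file containing the 'Client' section
    krb5_debug -- also enable the JDK's Kerberos debug output
    base_env   -- environment to copy (defaults to the current process environment)
    """
    env = dict(os.environ if base_env is None else base_env)
    flags = ["-Djava.security.auth.login.config=" + jaas_conf]
    if krb5_debug:
        flags.append("-Dsun.security.krb5.debug=true")
    # Assumption: the launcher script passes $CLIENT_JVMFLAGS on to the JVM.
    env["CLIENT_JVMFLAGS"] = " ".join(flags)
    return env


def run_zkcli(zkcli_path, servers, jaas_conf):
    """Start the ZooKeeper CLI against `servers` and return its exit code."""
    return subprocess.call([zkcli_path, "-server", servers],
                           env=client_env(jaas_conf))
```

For example, `run_zkcli("/usr/hdp/current/zookeeper-server/bin/zkCli.sh", "hdp-cluster-master1.apollon.mydomain.com:2181", "/path/to/zookeeper_client_jaas.conf")`, where the JAAS path is a placeholder for wherever the file above was saved.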

Either way, the ZooKeeper client starts up with the same error message:

Welcome to ZooKeeper!
JLine support is enabled
[zk: hdp-cluster-master1.apollon.tchibo.com:2181,hdp-cluster-master2.apollon.tchibo.com:2181,hdp-cluster-master3.apollon.tchibo.com:2181(CONNECTING) 0] 2017-05-30 13:37:08,418 - INFO  [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):Login@294] - successfully logged in.
2017-05-30 13:37:08,419 - INFO  [Thread-1:Login$1@127] - TGT refresh thread started.
2017-05-30 13:37:08,423 - INFO  [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ZooKeeperSaslClient$1@289] - Client will use GSSAPI as SASL mechanism.
2017-05-30 13:37:08,429 - INFO  [Thread-1:Login@302] - TGT valid starting at:        Tue May 30 13:37:38 CEST 2017
2017-05-30 13:37:08,429 - INFO  [Thread-1:Login@303] - TGT expires:                  Tue May 30 23:37:38 CEST 2017
2017-05-30 13:37:08,429 - INFO  [Thread-1:Login$1@181] - TGT refresh sleeping until: Tue May 30 22:04:12 CEST 2017
2017-05-30 13:37:08,431 - INFO  [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2017-05-30 13:37:08,435 - INFO  [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ClientCnxn$SendThread@864] - Socket connection established to ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181, initiating session
2017-05-30 13:37:08,442 - INFO  [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server ip-10-125-160-147.eu-central-1.compute.internal/10.125.160.147:2181, sessionid = 0x35c590dbb920004, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
2017-05-30 13:37:08,475 - ERROR [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ZooKeeperSaslClient@388] - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
2017-05-30 13:37:08,475 - ERROR [main-SendThread(ip-10-125-160-147.eu-central-1.compute.internal:2181):ClientCnxn$SendThread@1059] - SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
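"Server not found in Kerberos database (7)" generally means the KDC has no entry for the service principal the client asked for. In both logs the client connects to `ip-10-125-160-147.eu-central-1.compute.internal`, not to one of the `hdp-cluster-master*` names from the connect string. The sketch below pulls the resolved host names out of a client log and lists the ones missing from the connect string. It rests on an assumption not confirmed in this thread: that the ZooKeeper 3.4.x client builds the server principal as `zookeeper/<resolved host name>`. The function names are made up for illustration.

```python
import re

# Matches the client log line "Opening socket connection to server <host>/<ip>:<port>"
OPENING_RE = re.compile(
    r"Opening socket connection to server ([^/\s]+)/([\d.]+):(\d+)")


def resolved_hosts(log_text):
    """Return the set of server host names the client reports connecting to."""
    return {m.group(1) for m in OPENING_RE.finditer(log_text)}


def requested_principals(log_text, service="zookeeper"):
    """Guess the service principals (without realm) the client asked the KDC for.

    Assumption: the principal is '<service>/<resolved host name>'.
    """
    return sorted(f"{service}/{host}" for host in resolved_hosts(log_text))


def mismatched_hosts(log_text, connect_string):
    """Return resolved host names that do not appear in the connect string."""
    configured = {hp.rsplit(":", 1)[0] for hp in connect_string.split(",")}
    return sorted(resolved_hosts(log_text) - configured)
```

Any host returned by `mismatched_hosts` is a candidate for a forward/reverse DNS (or `/etc/hosts`) mismatch: if the keytabs were created for `zookeeper/hdp-cluster-master1...` but the client requests a ticket for the `ip-10-...compute.internal` name, the KDC would reject the request in this way.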

The ZooKeeper client error looks like the same problem that keeps Ambari Infra (Solr) from starting. I'm using HDP 2.6.03. Has anyone solved this?

2 REPLIES

Hello @Ramon Wartala,

I have a similar issue too. Were you able to resolve this?

Hi Ramon, I have the same issue. Did you find a solution?