
Unable to start Ambari-Infra in an HDF cluster due to ZooKeeper auth_fail

I have enabled Kerberos on the HDF cluster. When I start ambari-infra, it fails with a ZooKeeper error. I have confirmed that the JAAS files are updated correctly, and I can kinit with both zk.service.keytab and ambari-infra-solr.service.keytab. When Ambari invokes solrCloudCli.sh, the following error is reported: "Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7)). org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /clusterprops.json".

I have attached the Solr client logs: solor-error-log.txt

Thanks,

1 ACCEPTED SOLUTION

It turned out to be a file permissions problem. The umask was not set to 022, so access to the ambari-infra logs and configuration files was failing. The error message was misleading, since it pointed to a Kerberos error.
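For anyone checking the same thing, here is a minimal Python sketch of how a umask can be inspected and what it does to newly created files. It is only an illustration: the helper names are made up for this example, and the thread does not say which ambari-infra directories were affected, so the path is left for the caller to supply.

```python
import os
import stat


def current_umask() -> int:
    """Return the umask of this process without permanently changing it."""
    previous = os.umask(0)  # os.umask sets a new mask and returns the old one
    os.umask(previous)      # restore the original mask right away
    return previous


def mode_for_new_file(umask: int, requested: int = 0o666) -> int:
    """Permission bits a new file ends up with under the given umask."""
    return requested & ~umask


def readable_by_others(path: str) -> bool:
    """True if the 'other' read bit is set on path (hypothetical helper)."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


if __name__ == "__main__":
    mask = current_umask()
    print("umask:", oct(mask))
    print("new files get:", oct(mode_for_new_file(mask)))
```

With a umask of 022, mode_for_new_file(0o022) gives 0o644, so other accounts can read new files. With a stricter mask such as 077 the result is 0o600, and files written by one account are unreadable to a service running as another, which would match the access failures described above.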


9 REPLIES

@hello hadoop

What do the /etc/hosts files look like on your nodes? I had a similar issue and had to put the FQDN of each node first in its /etc/hosts entry. For example, I had

12.34.56.78 node1 node1.domain

12.34.56.79 node2 node2.domain

12.34.56.77 node3 node3.domain

When I switched them to

12.34.56.78 node1.domain node1

12.34.56.79 node2.domain node2

12.34.56.77 node3.domain node3

Everything started just fine.
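To see which order a hosts file currently uses, a small Python sketch like the following may help. It only reports entries whose first listed name has no dot (the first name on a hosts line is the canonical one). It does not decide which order is right for a given cluster. The function names are invented for this example.

```python
from typing import Iterable, List, Optional, Tuple


def parse_hosts_line(line: str) -> Optional[Tuple[str, List[str]]]:
    """Split one hosts-file line into (address, names).

    Returns None for blank lines, comments, and lines without a name.
    """
    content = line.split("#", 1)[0].strip()
    fields = content.split()
    if len(fields) < 2:
        return None
    return fields[0], fields[1:]


def short_name_first(lines: Iterable[str]) -> List[str]:
    """Return the addresses whose first listed name contains no dot.

    Note: entries such as 'localhost' are flagged too, since they have no dot.
    """
    flagged = []
    for line in lines:
        parsed = parse_hosts_line(line)
        if parsed is None:
            continue
        address, names = parsed
        if "." not in names[0]:
            flagged.append(address)
    return flagged


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        for address in short_name_first(handle):
            print("short name listed first for", address)
```

Run against the first set of entries above, this flags all three addresses. Run against the second set, it flags none.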

Thank you @Wynner

I have the hosts files in the format you mention, with the FQDN followed by the short name. However, my hostname is set to the short name (node1) without the domain. Would this be an issue?
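One way to see what the node reports for itself is the Python sketch below. It compares the configured hostname with the name the resolver returns as fully qualified. This is only a diagnostic aid, assuming Python is available on the node. It does not by itself show whether the short hostname is the cause of the error, and the function names are made up for this example.

```python
import socket
from typing import Dict, Union


def is_short_name(name: str) -> bool:
    """True if the name has no domain part (no dot)."""
    return "." not in name


def hostname_report() -> Dict[str, Union[str, bool]]:
    """Collect the configured hostname and the resolver's idea of the FQDN."""
    hostname = socket.gethostname()  # name configured on the machine
    fqdn = socket.getfqdn()          # name resolved through hosts file / DNS
    return {
        "hostname": hostname,
        "fqdn": fqdn,
        "hostname_is_short": is_short_name(hostname),
        "names_match": hostname == fqdn,
    }


if __name__ == "__main__":
    for key, value in hostname_report().items():
        print(f"{key}: {value}")
```

If the two names differ, it may be worth checking which of them appears in the service principals, since "Server not found in Kerberos database" generally means the KDC has no principal matching the name the client requested.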

@hello hadoop

My configuration was the other way around. Try switching yours to put the short name first.

I tried both ways, but I still get the same error. Even zkCli.sh fails with Auth_Failed.

@hello hadoop

What version of HDF are you using?

@Wynner

I am using HDF 2.1.1.0

Hi @hello hadoop, which directory needs its file permissions changed? I can't find clusterprops.json. Please help.

Reversing the FQDN and short names in the hosts file worked for me.
