HBase Web server problem with Kerberos

Hi,

Recently a problem arose with the HBase Master and RegionServers.

Cloudera Manager (CM) reports the following error for the Master and all three workers:

The Cloudera Manager Agent is not able to communicate with this role's web server.

Judging from the logs, my guess is that the problem is related to Kerberos:

Authentication exception: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)

 

So far I tried:

  1. Restart the cluster completely
  2. Regenerate Kerberos credentials:
    1. Stop Cluster
    2. Go to Administration->Security->Kerberos Credentials
    3. Select all and Regenerate selected
    4. Start Cluster
  3. Regenerate keytabs:
    1. Stop Cluster
    2. Go to Hosts->All hosts
    3. Select all hosts and select Regenerate Keytab from Actions
    4. Start Cluster
  4. Regenerate keytabs and then regenerate Kerberos credentials:
    1. Step 3 followed by step 2
  5. Run kinit manually:
    1. kinit succeeded for
      1. the HTTP principal on the HBase Master node and the workers
      2. the HBase principal on the HBase Master node and the workers
    2. kinit also works with the current keytab in /var/run/cloudera-scm-agent/process
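
A successful kinit shows that the keytab can obtain a ticket, but it does not show which key version the keytab holds. A key version mismatch between the keytab and the KDC is one commonly reported cause of "Checksum failed". As a minimal sketch, assuming the MIT Kerberos `klist -kt` output layout, the snippet below extracts the key version numbers (KVNOs) per principal from that output. The host names, realm, and keytab name in the sample are hypothetical.

```python
import re
from collections import defaultdict

# Matches data rows of `klist -kt` output (MIT Kerberos layout assumed), e.g.
#    3 01/01/2020 12:00:00 HTTP/worker1.example.com@EXAMPLE.COM
ROW = re.compile(r"^\s*(\d+)\s+\S+\s+\S+\s+(\S+@\S+)")


def kvnos_by_principal(klist_output):
    """Map each principal to the sorted list of key version numbers found."""
    found = defaultdict(set)
    for line in klist_output.splitlines():
        match = ROW.match(line)
        if match:
            found[match.group(2)].add(int(match.group(1)))
    return {principal: sorted(kvnos) for principal, kvnos in found.items()}


# Hypothetical sample; replace with the real output of `klist -kt <keytab>`.
SAMPLE = """\
Keytab name: FILE:hbase.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------
   3 01/01/2020 12:00:00 HTTP/worker1.example.com@EXAMPLE.COM
   3 01/01/2020 12:00:00 hbase/worker1.example.com@EXAMPLE.COM
"""

print(kvnos_by_principal(SAMPLE))
```

The numbers can then be compared with what the KDC reports for the same principal, for example via the MIT `kvno` command run after a kinit.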

None of this has solved the problem, and I am out of ideas about what to try next.

We are using CDH 6.3.1.

Thank you for your help.

Michal

1 REPLY

Checking the CM agent log is a good place to start with this issue.

The Cloudera Manager Agent is not able to communicate with this role's web server. 

You should first check whether there are network issues between the CM and HBase hosts.

For example, are krb5.conf and /etc/hosts consistent across the hosts?
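
Kerberos service principals such as HTTP/<fqdn> depend on how each host name resolves, so one quick consistency test is to check that forward and reverse lookups agree. The following is only a sketch using Python's standard `socket` module; the function name `check_forward_reverse` is made up for illustration, and the resolver arguments exist only so the check can be tried without a real network.

```python
import socket


def check_forward_reverse(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP and back; report whether the names agree."""
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, ips)
    return {
        "hostname": hostname,
        "ip": ip,
        "reverse_name": reverse_name,
        "consistent": reverse_name.lower() == hostname.lower(),
    }


if __name__ == "__main__":
    # Checks the local host's fully qualified name; run this on each host.
    try:
        print(check_forward_reverse(socket.getfqdn()))
    except OSError as exc:
        print("resolution failed:", exc)
```

If `consistent` is false on any host, or the reverse name is a short name rather than the FQDN, comparing /etc/hosts and DNS on that host is a reasonable next step.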

We don't know where you saw the exception below:

Authentication exception: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
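
To pin down which log the exception comes from, a small scan like the one below can help. It is a rough sketch: the helper name `find_gss_errors` is hypothetical, and the sample lines are invented for illustration except for the exception text quoted above.

```python
def find_gss_errors(lines, needle="GSSException"):
    """Return (line_number, text) for every line that mentions the needle."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if needle in line]


# Invented sample lines; in practice pass an open log file instead.
sample_log = [
    "INFO starting web server\n",
    "WARN Authentication exception: GSSException: Failure unspecified at "
    "GSS-API level (Mechanism level: Checksum failed)\n",
    "INFO request finished\n",
]

for number, text in find_gss_errors(sample_log):
    print(number, text)
```

Running it over the CM agent log and the HBase role logs on each host (for instance with `open(path, errors="replace")`, where `path` is wherever those logs live on your hosts) shows which component actually reports the error.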

This KB article (https://my.cloudera.com/knowledge/Troubleshooting-Kerberos-Related-Issues-Common-Errors-and?id=76192) is also a good starting point for troubleshooting Kerberos-related issues; see symptoms 15 and 18.