Hi,
Recently a problem arose with the HBase Master and Region Servers.
The error in CM is the following:
The Cloudera Manager Agent is not able to communicate with this role's web server.
It appears for the master and for all three workers.
The problem seems to be related to Kerberos (my guess after investigating the logs):
Authentication exception: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
So far I have tried:
1. Restarting the cluster completely
2. Regenerating Kerberos credentials:
   - Stop Cluster
   - Go to Administration -> Security -> Kerberos Credentials
   - Select all and click Regenerate Selected
   - Start Cluster
3. Regenerating keytabs:
   - Stop Cluster
   - Go to Hosts -> All Hosts
   - Select all hosts and choose Regenerate Keytab from Actions
   - Start Cluster
4. Regenerating keytabs and then regenerating Kerberos credentials:
   - Point 3 and then point 2
5. Trying kinit manually:
   - kinit succeeded for:
     - HTTP on the HBase master node and workers
     - HBase on the HBase master node and workers
   - No problem with kinit using the current keytab in /var/run/cloudera-scm-agent/process
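A manual keytab check like the one described above can be sketched roughly as follows. This is only an illustration: the helper names `newest_process_dir` and `check_keytab` are hypothetical, and the process directory pattern (`*hbase-MASTER*`), the keytab file name (`hbase.keytab`), and the principal and realm in the usage comment are assumptions that will differ per cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a role keytab by hand.
# Names, directory patterns and principals are assumptions, not taken from the cluster.

# Print the most recently modified process directory whose name contains the role pattern.
newest_process_dir() {
  local base="$1" role="$2"
  ls -1dt "$base"/*"$role"* 2>/dev/null | head -n 1
}

# List the keytab entries (KVNO, timestamps, encryption types),
# then try to obtain a ticket with the keytab into a throwaway cache.
check_keytab() {
  local keytab="$1" principal="$2"
  local cache
  cache="$(mktemp)"   # separate ticket cache so existing tickets are untouched
  klist -kte "$keytab" &&
    KRB5CCNAME="FILE:$cache" kinit -kt "$keytab" "$principal" &&
    KRB5CCNAME="FILE:$cache" klist
  local rc=$?
  rm -f "$cache"
  return "$rc"
}

# Example usage (assumed paths and principal; adjust to the cluster):
#   dir="$(newest_process_dir /var/run/cloudera-scm-agent/process hbase-MASTER)"
#   check_keytab "$dir/hbase.keytab" "hbase/$(hostname -f)@EXAMPLE.COM"
```

The `klist -kte` output is worth keeping next to the result, since it shows the key version numbers and encryption types stored in the keytab that was actually tested.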
So far I have had no luck with this problem, and I have no idea what to try next.
We are using CDH 6.3.1.
Thank you for your help.
Michal