10-06-2017 07:45 AM - edited 10-06-2017 07:48 AM
I'm trying to move the Cloudera Management Services to another host following these steps: https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_ag_restore_server.html
It all works fine until the point at which I have to start the new services (Activity Monitor, Host Monitor, etc.).
When I try to start them they fail saying:
Command failed to run because this role has invalid configuration. Review and correct its configuration. First error: Role is missing Kerberos keytab. Go to the Kerberos Credentials page and click the Generate Missing Credentials button.
Then when I go to "Security", "Kerberos Credentials" and click on "Generate Missing Credentials" I get:
/usr/share/cmf/bin/gen_credentials.sh failed with exit code 1 and output of <<
+ export PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/usr/lib/mit/bin:/usr/bin:/sbin:/usr/sbin:/bin:/usr/bin
+ PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/usr/lib/mit/bin:/usr/bin:/sbin:/usr/sbin:/bin:/usr/bin
+ CMF_REALM=EXAMPLE.NET
+ KEYTAB_OUT=/var/run/cloudera-scm-server/cmf3146936050096402809.keytab
+ PRINC=hdfs/hadoop-data04.example.net@EXAMPLE.NET
+ MAX_RENEW_LIFE=432000
+ KADMIN='kadmin -k -t /var/run/cloudera-scm-server/cmf5911913375869248594.keytab -p cloudera-scm@EXAMPLE.NET -r EXAMPLE.NET'
+ RENEW_ARG=
+ '[' 432000 -gt 0 ']'
+ RENEW_ARG='-maxrenewlife "432000 sec"'
+ '[' -z /etc/krb5.conf ']'
+ echo 'Using custom config path '\''/etc/krb5.conf'\'', contents below:'
+ cat /etc/krb5.conf
+ kadmin -k -t /var/run/cloudera-scm-server/cmf5911913375869248594.keytab -p cloudera-scm@EXAMPLE.NET -r EXAMPLE.NET -q 'addprinc -maxrenewlife "432000 sec" -randkey hdfs/hadoop-data04.example.net@EXAMPLE.NET'
kadmin: Database error! Required KADM5 principal missing while initializing kadmin interface
>>
This is where I get stuck.
I already clicked on the "Import Kerberos Account Manager Credentials" button and imported the credentials so that the cloudera-scm user can access the AD and recreate Kerberos principals.
Maybe there is an extra step when moving CM to another host if the cluster is using Kerberos?
10-06-2017 03:52 PM
The error you are getting is regarding "kadmin" which is an MIT Kerberos client command for the MIT KDC.
However, you mention "cloudera-scm user can access the AD and recreate kerberos principals."
If you are using Active Directory as your KDC, you appear to have a misconfiguration in which the Kerberos KDC type is set to "MIT".
In Cloudera Manager, go to "Administration --> Settings" and click "Kerberos" on the left under the CATEGORY section.
On the right, make sure you have "KDC Type" set to "Active Directory" if you are using Active Directory for your KDC.
Save your change and try importing credentials and generating missing credentials.
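To tell which path Cloudera Manager took, it may help to check whether the failing command's output contains an actual kadmin call, since kadmin is the MIT client. A minimal Python sketch of that check follows; the function name and the sample string are illustrative only and are not part of Cloudera Manager.

```python
import re


def uses_mit_kadmin(trace: str) -> bool:
    """Return True if a gen_credentials trace shows a kadmin invocation.

    kadmin is the MIT Kerberos admin client, so seeing it invoked suggests
    Cloudera Manager is treating the KDC as MIT rather than Active Directory.
    """
    # In a `set -x` style trace each executed command is prefixed with "+ ".
    # The match is case-sensitive, so the "+ KADMIN='...'" variable
    # assignment is not counted as an invocation.
    return re.search(r"\+\s+kadmin\s", trace) is not None


# Shortened sample modelled on the trace in the question (illustrative).
sample = "+ CMF_REALM=EXAMPLE.NET + kadmin -k -t /tmp/example.keytab -p cloudera-scm@EXAMPLE.NET"
print(uses_mit_kadmin(sample))
```

If this returns True for the output you see after saving the setting, the MIT code path is still being used and the KDC type change has probably not taken effect.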
10-09-2017 02:47 AM - edited 10-09-2017 02:49 AM
Thanks for a quick response.
I already had this setting set to "Active Directory" since I exported and then imported the Cloudera Manager configuration.
However, when I tried again today I was able to generate the missing credentials, and the error I got last time was gone.
The two things I did differently from my previous attempt that could have made the difference are the following:
1. Reinstalled openldap-clients on all nodes ("yum reinstall openldap-clients -y").
2. Stopped the cloudera-scm-server on the host that was running the CM services before (last time this service was running both on the new and the old CM hosts).
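For the second point, one way to confirm that the old Cloudera Manager server is really down is to check that nothing is accepting TCP connections on its ports any more. The following is a rough Python sketch under assumptions: the helper name is hypothetical, and the ports mentioned below are the Cloudera Manager defaults, which may differ on your install.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

For example, calling is_listening with the old host's name and port 7180 (default web UI) or 7182 (default agent port) should return False once cloudera-scm-server has been stopped there.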
Note: When I brought up the new Cloudera Manager, it showed all services as stopped even though I had left them running under the old CM. When I tried to start them, they failed with "address in use" and similar errors. I had to repoint all nodes to the old CM, shut down the cluster from the old CM, and then point all nodes to the new CM. After that the services started fine, with the exception of the Sentry service, for which I had to reboot the host it was running on to make it release the PIDs and lock files it was holding.
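Repointing a node usually means changing the server_host entry in the agent's config file. The Python sketch below shows that edit on the file's text. It assumes a default install where the file is /etc/cloudera-scm-agent/config.ini; the function name and the hostnames are made up for illustration.

```python
import re


def set_server_host(config_text: str, new_host: str) -> str:
    """Return config_text with the active server_host line set to new_host."""
    # Match only an uncommented "server_host = value" line. [ \t]* is used
    # instead of \s* so the pattern can never run on into the next line.
    pattern = re.compile(r"^([ \t]*server_host[ \t]*=[ \t]*).*$", re.MULTILINE)
    updated, count = pattern.subn(lambda m: m.group(1) + new_host, config_text)
    if count == 0:
        raise ValueError("no server_host entry found")
    return updated


# Hypothetical config contents and hostnames, for illustration only.
sample = "[General]\nserver_host=old-cm.example.net\nserver_port=7182\n"
print(set_server_host(sample, "new-cm.example.net"))
```

The agent only reads this file at startup, so each node's cloudera-scm-agent would need a restart after the change.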