04-07-2016 01:40 PM - edited 04-07-2016 02:07 PM
I have a Solr instance installed on a Master/Name Node, and it runs fine. I subsequently installed another Solr instance on a Slave/Data Node. It starts fine but fails a few minutes later with:
Solr Server API Liveness
The Cloudera Manager Agent is not able to communicate with this Solr Server over the HTTP API.
Web Server Status
The Cloudera Manager Agent is not able to communicate with this role's web server.
I found the following in the cloudera-scm-agent.log file:
HTTPError: HTTP Error 401: Unauthorized
[07/Apr/2016 21:59:20 +0000] 1515 Monitor-SolrServerMonitor urllib2_kerberos CRITICAL GSSAPI Error: Unspecified GSS failure. Minor code may provide more information/Server krbtgt/DOMAIN.COM@HADOOPSS.DOMAIN.COM not found in Kerberos database
[07/Apr/2016 21:59:20 +0000] 1515 Monitor-SolrServerMonitor url_util ERROR Autentication error on attempt 2. Retrying after sleeping 1.000000 seconds.
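For anyone scanning a long agent log for the same problem, here is a minimal Python sketch that pulls out the principal named in the "not found in Kerberos database" message. The function name and the regular expression are my own illustration and only assume the message format shown in the log line above; they are not part of Cloudera Manager.

```python
import re

# Assumed pattern: matches the "Server <principal> not found in Kerberos database"
# text as it appears in the log line quoted above.
MISSING_PRINCIPAL = re.compile(r"Server (\S+) not found in Kerberos database")


def find_missing_principals(log_text):
    """Return the distinct principals reported missing, in order of first appearance."""
    seen = []
    for match in MISSING_PRINCIPAL.finditer(log_text):
        principal = match.group(1)
        if principal not in seen:
            seen.append(principal)
    return seen


# Sample text copied from the cloudera-scm-agent.log excerpt in this post.
sample = (
    "[07/Apr/2016 21:59:20 +0000] 1515 Monitor-SolrServerMonitor urllib2_kerberos "
    "CRITICAL GSSAPI Error: Unspecified GSS failure. Minor code may provide more "
    "information/Server krbtgt/DOMAIN.COM@HADOOPSS.DOMAIN.COM not found in Kerberos database"
)

if __name__ == "__main__":
    for principal in find_missing_principals(sample):
        print(principal)
```

In this case the principal reported is krbtgt/DOMAIN.COM@HADOOPSS.DOMAIN.COM, which tells you which entry the KDC could not find.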
04-08-2016 07:05 AM
I fixed this by deleting the original principals from the MIT KDC and then running Generate Missing Credentials from the Administration->Security->Kerberos Credentials page. After that, add new Solr instances as required, and each one will get a new principal and keytab as needed.
When the instances were created as part of the original cluster build, the principals were invalid. They may have been left over from a previous build or created in an invalid state. Deleting them from the KDC seems to fix this.
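The post above does not give the exact commands used for the deletion, so the following is only a sketch of how it could be done. It assumes shell access to the MIT KDC host, where kadmin.local is available, and uses the standard kadmin queries listprincs and delprinc. The helper names, the solr/* pattern, and the example principal are hypothetical. The script only prints the commands so they can be reviewed before anything is deleted.

```python
import shlex


def list_command(pattern="solr/*"):
    """Build a kadmin.local call that lists principals matching a glob pattern."""
    return ["kadmin.local", "-q", f"listprincs {pattern}"]


def delete_command(principal):
    """Build a kadmin.local call that deletes one principal without prompting."""
    if not principal or any(ch.isspace() for ch in principal):
        raise ValueError("principal must be a non-empty name without whitespace")
    return ["kadmin.local", "-q", f"delprinc -force {principal}"]


def render(argv):
    """Render an argument list as a shell-quoted command line for review."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical principal name; substitute the stale entries found on your KDC.
    stale = ["solr/datanode1.example.com@EXAMPLE.COM"]
    print(render(list_command()))
    for principal in stale:
        print(render(delete_command(principal)))
```

Once the stale principals are gone, Generate Missing Credentials in Cloudera Manager recreates them, as described above.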
04-08-2016 07:24 AM
Thanks for sharing your solution @shaileshCG, hopefully it will be of assistance to others in the future.