After moving the Solr data directory, I can no longer start Atlas or send audits to Solr. The Solr UI also "loads", but the content is never displayed; it just shows a spinning wheel.
The cluster is Kerberized and SSL is enabled on Ranger / Atlas / HDFS / Ambari / YARN. I have another cluster with identical settings working fine.
I have tried deleting the Infra Solr znode, but it did not solve the issue. I'm seeing "no key to store" and "request is a replay" in the Atlas logs. I will most likely remove Ambari Infra and reinstall it, but I would like to keep that as a last resort.
Could you clarify what you mean by 'moving the Solr data directory'?
Where did you move it, and why? What is the current Solr data directory? What are the permissions on the new Solr data directory?
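If it helps, here is a minimal Python sketch for checking ownership under the new data directory. It only lists entries that are not owned by the expected user. The directory path and the "solr" user name are assumptions on my part, so replace them with your actual values, and `find_unexpected_owners` is just a helper name I made up for illustration.

```python
import os
import pwd


def find_unexpected_owners(root, expected_uid):
    """Return paths under root (directories and files) not owned by expected_uid."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatches.append(path)
    return mismatches


if __name__ == "__main__":
    # Hypothetical values: adjust to your new data directory and service user.
    data_dir = "/path/to/new/solr/data"
    service_user = "solr"
    try:
        uid = pwd.getpwnam(service_user).pw_uid
    except KeyError:
        print(f"No such user on this host: {service_user}")
    else:
        for path in find_unexpected_owners(data_dir, uid):
            print(f"Not owned by {service_user}: {path}")
```

An empty result only tells you that ownership matches. It does not check mode bits or anything else that could block access.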
I would look for errors in the Solr log file and start digging from there. If there are no errors in the Solr log, then enabling debug-level logging might help.
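As a rough starting point, something like the following Python sketch could pull the ERROR and WARN lines out of a log file. The log path is an assumption (use wherever your Solr instance actually writes its log), and the matching assumes the level appears as a standalone word on each line. `find_problem_lines` is a hypothetical helper name.

```python
import re
from pathlib import Path


def find_problem_lines(lines, levels=("ERROR", "WARN")):
    """Return (line_number, level, text) for each line that mentions one of the levels."""
    pattern = re.compile(r"\b(" + "|".join(re.escape(level) for level in levels) + r")\b")
    hits = []
    for number, line in enumerate(lines, start=1):
        match = pattern.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical path: replace with your actual Solr log location.
    log_path = Path("/var/log/ambari-infra-solr/solr.log")
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            for number, level, text in find_problem_lines(handle):
                print(f"{number}: [{level}] {text}")
    else:
        print(f"Log file not found: {log_path}")
```

Passing `levels=("DEBUG",)` would give the same kind of listing once debug logging is turned on.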
If Atlas is saying "no key to store", then it is definitely missing some vital info in Solr. In this case, reinstalling Solr might not help, but reinstalling Atlas (which will force Atlas to recreate its Solr data) might do the trick.
Hope this helps.
I did not move it myself. It was moved to a different disk to leave room for future growth, and the person who moved it mistakenly removed the old files without realizing it. It went from /opt/solr to a similar location on a different disk.
The permissions are the same, and the solr user is able to access the files fine.
The Solr log is not showing much in the way of errors, even with debugging on.
When you say Atlas is missing some vital information in Solr, what information are you referring to? Isn't "no key to store" a Kerberos issue? Combined with the "request is a replay" error, I'm led to believe Kerberos is the issue.
What risks do I run when I remove Solr / Atlas from a secured cluster? The cluster doesn't have any data on it at the moment, so loss of data is not a huge concern. I'm mainly concerned about configuration mismatches from installing and reinstalling services.
Thanks for all of your help.
The "request is a replay" message is definitely a Kerberos error. But "no key to store" may or may not be a Kerberos error; it depends on the context. If you can show us the full error stack, that will help us understand this issue better.
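To collect the full stack for a given message, a small Python sketch like this one may help. It groups a log into entries and returns the entries that contain the message, including the stack trace lines that follow. It assumes each log entry starts with a date in YYYY-MM-DD form and that any other line continues the previous entry. The Atlas log path is also an assumption, and `entries_containing` is a hypothetical helper name.

```python
import re

# Assumption: a new log entry begins with a date such as 2020-01-31.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def entries_containing(lines, needle):
    """Return whole log entries (first line plus continuation lines) that mention needle."""
    entries, current = [], []
    for line in lines:
        if ENTRY_START.match(line) and current:
            entries.append("\n".join(current))
            current = []
        current.append(line.rstrip("\n"))
    if current:
        entries.append("\n".join(current))
    needle = needle.lower()
    # Case-insensitive match so "No key to store" and "no key to store" both hit.
    return [entry for entry in entries if needle in entry.lower()]


if __name__ == "__main__":
    # Hypothetical path: replace with your actual Atlas log location.
    log_path = "/var/log/atlas/application.log"
    try:
        with open(log_path, errors="replace") as handle:
            lines = handle.readlines()
    except OSError as exc:
        print(f"Could not read {log_path}: {exc}")
    else:
        for message in ("no key to store", "request is a replay"):
            for entry in entries_containing(lines, message):
                print(entry)
                print("-" * 60)
```

Posting the output for both messages, with any sensitive host or principal names masked, should give enough context to tell whether "no key to store" is coming from the Kerberos login path.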