Created on 04-03-2014 12:00 PM - edited 09-16-2022 01:56 AM
I had an 'interesting' experience setting up Cloudera Search as an addition to a not too shabby HBase cluster.
Problems started when I created a collection with a trailing '/ ', which apparently is not allowed. In hindsight I now know that this created an item in the overseer queue that could not be processed, blocking all further requests. It showed up in the logs as the overseer being stuck in a loop.
Not knowing this at the time, I tried a 'solrctl init', which did not work. After reading the warnings that this could mess up any previous Solr state, which we didn't have, I continued with "solrctl init --force". I was a little surprised to see that the entire /hbase entry in ZooKeeper was wiped clean and all of HBase was in a state of panic, having lost its entire administration.
Reverting to ZooKeeper snapshots got my HBase back up and running, but I'm still baffled on:
1. How could this have happened?
2. If this is even a remote possibility with this command, I would recommend adding some extra red flags to the documentation that recommends this option.
I'm running CDH4.5 with solr 1.1.
Created 04-13-2014 10:21 PM
Hi RobV.
Are you using CM to manage the cluster or non-CM? I suspect that the "--zk" option is not being passed correctly, either on the command line or as set up in the default Solr config (perhaps you switched hosts?). solrctl should be managing "/solr" in ZK, but it can be passed another root. If the root is not specified it will default to "/", at which point it might try to delete "/hbase" accidentally (cleanup). I'll enter a bug report for us to look at this.
Re the overseer getting stuck, it sounds similar to this:
I'll verify that we check the collection doesn't end with a "/" character.
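For illustration, a check along these lines could reject such a name before anything reaches the overseer queue. This is only a sketch in Python, not solrctl's actual code: the function name and the allowed character set are assumptions made for the example.

```python
import re

# Assumed rule for this sketch: letters, digits, underscore, dot and hyphen only.
# The real set of characters accepted by Solr may differ.
_VALID_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def validate_collection_name(name: str) -> str:
    """Return the name unchanged if it is acceptable, else raise ValueError."""
    # fullmatch rejects empty names, whitespace and any '/' (trailing or not).
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(f"invalid collection name: {name!r}")
    return name
```

With this rule a name such as "logs/" is refused on the client side instead of ending up as an unprocessable request.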
Regards,
Patrick
Created 04-16-2014 10:35 AM
Yes, we manage the cluster with CM. Reading your reply, I'm now sure the new edge node we added did not get a 'deploy client config', so it was missing the proper settings. Not knowing this at the time, and with solrctl not working as expected (without the proper client configs), I remember manually adding the settings to the solrctl command, most likely without the required /solr root, resulting in the wipe of ZooKeeper's /. Thanks for clearing this up.
Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with a --force, is pretty scary and not what you expect as an end user of a Solr-specific tool.
Thanks for filing the reports,
Rob
Created 04-16-2014 10:59 AM
> Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with a --force, is pretty scary and not what you expect as an end user of a Solr-specific tool.
I understand. However, you are passing the "--force" option, which is really meant only for the case where you absolutely want to force the clearance ("rm -f /*" being a classic, similar case). Without this option we would complain and tell you that if you really want to do this you need to use --force (given that it might be dangerous, etc.).
So the question is what we should do to handle this case better.
Requiring you to say "solrctl --force --really" or somesuch doesn't sound reasonable.
Say you specify "/" in your --zk rather than "/solr" (or it's the default, as in your case). One option is that we could refuse to make the change if we find znodes in "/" (in this case) that don't belong there. However, that seems error-prone and not a great solution (say it's not /hbase but something else, etc.). And say you have "/solr1" and "/solr2": the same issue would apply if you accidentally specified the wrong one with --force.
Any suggestions/ideas?
I guess solrctl could prompt you interactively when you use --force:
"solrctl is about to reinitialize '/solr' repository, accept?"
or
"solrctl is about to reinitialize '/' repository, accept?" in your case.
But then non-interactive use might suffer (we could add a "-y" option or somesuch to compensate). This is the way I'm leaning at the moment.
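Roughly, the prompt logic could look like the following. This is a sketch in Python for illustration only, not how solrctl is implemented: the function and parameter names are hypothetical, and "-y" is just the option floated above.

```python
import sys


def confirm_reinit(zk_root, assume_yes=False, input_fn=input, interactive=None):
    """Decide whether a forced reinitialization of zk_root may proceed.

    assume_yes  -- stands in for the proposed "-y" option (hypothetical)
    input_fn    -- injectable so the prompt can be exercised without a terminal
    interactive -- None means "detect from stdin"
    """
    if assume_yes:
        return True
    if interactive is None:
        interactive = sys.stdin.isatty()
    if not interactive:
        # No terminal and no -y: refuse rather than wipe anything silently.
        return False
    answer = input_fn(
        f"solrctl is about to reinitialize '{zk_root}' repository, accept? [y/N] "
    )
    # Anything other than an explicit yes counts as a refusal.
    return answer.strip().lower() in ("y", "yes")
```

The default answer is "no", so a stray Enter, or a script run without -y, leaves ZooKeeper untouched.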
Created 04-17-2014 12:05 PM
I'm wondering if there is ever a reason for Solr to be in the root of a ZooKeeper install. Shouldn't it always be in some path inside '/'? In that case --zk being '/' would indicate a problem, either in the configuration or in the user making a mistake, something you could alert on or even refuse to run.
Adding the prompt on --force would be a great step and I see the use of the -y option.
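To make the root check concrete, here is a rough Python sketch of what I mean. It assumes the --zk value follows the usual ZooKeeper connect-string form "host:port,host:port/chroot"; the function names are made up, and this is not taken from solrctl itself.

```python
def zk_chroot(connect_string: str) -> str:
    """Return the chroot part of a ZooKeeper connect string, '/' if there is none."""
    # Everything after the first '/' is the chroot path; the hosts come before it.
    _, slash, path = connect_string.partition("/")
    if not slash:
        return "/"
    return "/" + path.strip("/")


def check_zk_root(connect_string: str) -> str:
    """Refuse to operate on the root of the ensemble (hypothetical policy)."""
    chroot = zk_chroot(connect_string)
    if chroot == "/":
        raise ValueError(
            "refusing to use '/' as the Solr root in ZooKeeper; "
            "expected something like host:2181/solr"
        )
    return chroot
```

A value like "zk1:2181,zk2:2181/solr" passes and yields "/solr", while "zk1:2181" or "zk1:2181/" is rejected before anything is deleted.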