solrctl init --force wiped ZooKeeper HBase entries

Explorer

I had an 'interesting' experience setting up Cloudera Search as an addition to a not-too-shabby HBase cluster.

 

Problems started when I created a collection with a trailing '/ ', which apparently is not allowed. In hindsight, I now know that this created an item in the overseer queue that could not be processed, blocking all further requests. It showed up in the logs as the overseer being stuck in a loop.

 

Not knowing this at the time, I tried 'solrctl init', which did not work. After reading the warnings that this could mess up any previous Solr state, which we didn't have, I continued with "solrctl init --force". I was a little surprised to see that the entire /hbase entry in ZooKeeper was wiped clean, leaving all of HBase in a state of panic, having lost its entire administration.

 

Reverting to ZooKeeper snapshots got my HBase back up and running, but I'm still baffled:

1. How could this have happened?

2. If this is even a remote possibility with this command, I would recommend adding some extra red flags to the documentation that recommends this option.

 

I'm running CDH4.5 with solr 1.1.

1 ACCEPTED SOLUTION

Cloudera Employee

Hi RobV. 

 

Are you using CM to manage the cluster, or is it non-CM? I suspect that the "--zk" option is not being passed correctly, either on the command line or as set up in the default Solr config (perhaps you switched hosts?). solrctl should be managing "/solr" in ZK; however, it can be passed another root. If the root is not specified, it will default to "/", at which point it might accidentally try to delete "/hbase" as part of its cleanup. I'll enter a bug report for us to look at this.
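
To illustrate how a missing root ends up as "/", here is a minimal sketch of splitting a ZooKeeper connection string of the usual form "host:port,host:port/chroot". This is not solrctl's actual code; the function name split_zk_connect_string and the host names are made up for the example.

```python
def split_zk_connect_string(zk):
    """Split 'host1:2181,host2:2181/solr' into (hosts, chroot).

    Hypothetical helper: when no path follows the host list,
    the chroot falls back to the ZooKeeper root '/'.
    """
    hosts, _, path = zk.partition("/")
    path = path.strip("/")
    chroot = "/" + path if path else "/"
    return hosts.split(","), chroot


# With the chroot present, only /solr is in scope.
print(split_zk_connect_string("zk1:2181,zk2:2181/solr"))
# Without it, the whole tree (including /hbase) is in scope.
print(split_zk_connect_string("zk1:2181,zk2:2181"))
```

The second call is the dangerous case described above: nothing in the string looks wrong, yet the root it resolves to is shared with every other service in the quorum.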

 

Re the overseer getting stuck, it sounds similar to this:

 

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Release-Notes/cd5rn_...

 

I'll verify that we check that the collection name doesn't end with a "/" character.
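
A check like that could be as small as the sketch below. It only covers the case from this thread (a trailing "/", possibly followed by whitespace) and an empty name; the function name validate_collection_name is hypothetical, and Solr's real naming rules may be stricter.

```python
def validate_collection_name(name):
    """Return the trimmed name, or raise ValueError if it is unusable.

    Assumption: surrounding whitespace is trimmed rather than rejected.
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("collection name is empty")
    if cleaned.endswith("/"):
        raise ValueError(f"collection name {cleaned!r} must not end with '/'")
    return cleaned


print(validate_collection_name("books"))
try:
    validate_collection_name("books/ ")
except ValueError as err:
    print("rejected:", err)
```

Rejecting the name up front would keep the bad request from ever reaching the overseer queue.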

 

Regards,

 

Patrick

Explorer

Yes, we manage the cluster with CM. Reading your reply, I'm now sure that the new edge node we added did not get a 'deploy client config', so it was missing the proper settings. Not knowing this at the time, I found that solrctl did not work as expected (without the proper client configs). I remember manually adding the settings to the solrctl command, most likely without the required /solr root, resulting in the wipe of ZooKeeper's /. Thanks for clearing this up.

 

Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with a --force, is pretty scary and not what you expect as an end user of a Solr-specific tool.

 

Thanks for filing the reports,

  Rob

 

 

Cloudera Employee

"Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with a --force, is pretty scary and not what you expect as an end user of a Solr-specific tool."

 

I understand. However, you are passing the "--force" option, which is really just meant for the case where you absolutely want to force the clearance (e.g. "rm -f /*" being a classic/similar case). Without this option we would complain and tell you that if you really want to do this, you need to use --force (given it might be dangerous, etc...).

 

So the question is what we should do to better handle this case.

 

Requiring you to say "solrctl --force --really" or somesuch doesn't sound like it would be reasonable.

 

Say you specify "/" in your --zk rather than "/solr" (or it's the default, as in your case). One option is that we could refuse to make the change if we find znodes in "/" (in this case) that shouldn't belong there. However, that seems error-prone and not a great solution (say it's not /hbase but something else, etc...). Say you have "/solr1" and "/solr2": the same issue would apply if you accidentally specified the wrong one with --force, etc...

 

Any suggestions/ideas?

 

I guess solrctl could prompt you interactively when you use --force:

 

"solrctl is about to reinitialize '/solr' repository, accept?"

or 

"solrctl is about to reinitialize '/' repository, accept?" in your case.

 

Non-interactive use might suffer from this, but we could add a "-y" option or somesuch to compensate. This is the way I'm leaning at the moment.
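
A rough sketch of that prompt-plus-"-y" behaviour follows. It is not solrctl itself: the program name solrctl-sketch, the --zk-root flag, its "/solr" default, and the function names are all assumptions made for the example. Only --force, -y, and the prompt wording come from this thread.

```python
import argparse


def confirm_reinit(zk_root, assume_yes, ask=input):
    """Ask before wiping zk_root unless -y was given."""
    if assume_yes:
        return True
    answer = ask(
        f"solrctl is about to reinitialize '{zk_root}' repository, accept? [y/N] "
    )
    return answer.strip().lower() in ("y", "yes")


def build_parser():
    parser = argparse.ArgumentParser(prog="solrctl-sketch")
    parser.add_argument("--zk-root", default="/solr")  # hypothetical flag
    parser.add_argument("--force", action="store_true")
    parser.add_argument("-y", "--yes", action="store_true")
    return parser


def should_wipe(argv, ask=input):
    """Return True only if --force was given and the user (or -y) agreed."""
    args = build_parser().parse_args(argv)
    if not args.force:
        return False
    return confirm_reinit(args.zk_root, args.yes, ask)


# Non-interactive use: -y skips the prompt entirely.
print(should_wipe(["--force", "-y"]))
```

Printing the resolved root in the prompt is the useful part: a user who sees '/' where '/solr' was expected gets one last chance to stop.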

Explorer

I'm wondering if there is ever a reason for Solr to be in the root of a ZooKeeper install. Shouldn't it always be in some path inside '/'? In that case, --zk being '/' would indicate a problem, either in the configuration or a mistake by the user, which is something you could alert on or even refuse to run with.
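
Something along these lines is what I have in mind, as a minimal sketch. The function name ensure_safe_zk_root is hypothetical and this is not how solrctl currently behaves.

```python
def ensure_safe_zk_root(zk_root):
    """Normalize the root path and refuse the bare ZooKeeper root '/'."""
    normalized = "/" + zk_root.strip().strip("/")
    if normalized == "/":
        raise ValueError(
            "refusing to initialize Solr at the ZooKeeper root '/'; "
            "pass a path such as '/solr'"
        )
    return normalized


print(ensure_safe_zk_root("/solr/"))
try:
    ensure_safe_zk_root("/")
except ValueError as err:
    print("refused:", err)
```

This would not catch pointing at the wrong Solr root (the '/solr1' versus '/solr2' case), but it does rule out wiping the znodes of every other service.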

 

Adding the prompt on --force would be a great step and I see the use of the -y option.