Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

solrctl init --force wiped ZooKeeper /hbase entries

Explorer

I had an 'interesting' experience setting up Cloudera Search as an addition to a not too shabby HBase cluster.

 

Problems started when I created a collection with a trailing '/ ', which is apparently not allowed. In hindsight, I now know that this created an item in the overseer queue that could not be processed and blocked all further requests. It showed up in the logs as the overseer being stuck in a loop.

 

Not knowing this at the time, I tried 'solrctl init', which did not work. After reading the warnings that this could mess up any previous Solr state, which we didn't have, I continued with "solrctl init --force". I was a little surprised to see that the entire /hbase entry in ZooKeeper was wiped clean, leaving all of HBase in a state of panic, having lost its entire administration.

 

Reverting to ZooKeeper snapshots got my HBase back up and running, but I'm still baffled about two things:

1. How could this have happened?

2. If this is even a remote possibility of this command, I would recommend adding some extra red flags around the documentation recommending this option.

 

I'm running CDH4.5 with solr 1.1.

1 ACCEPTED SOLUTION

Cloudera Employee

Hi RobV. 

 

Are you using CM to manage the cluster, or is it non-CM? I suspect that the "--zk" option is not being passed correctly, either on the command line or as set up in the default Solr config (perhaps you switched hosts?). solrctl should be managing "/solr" in ZK, but it can be passed another root. If the root is not specified, it defaults to "/", at which point the cleanup might accidentally try to delete "/hbase". I'll enter a bug report for us to look at this.

 

Re the overseer getting stuck, it sounds similar to this:

 

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Release-Notes/cd5rn_...

 

I'll verify that we check the collection doesn't end with a "/" character.
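
For illustration only, a check of that kind might look like the Python sketch below. It is not solrctl's or Solr's actual validation; the function name `check_collection_name` is hypothetical, and rejecting every "/" (not only a trailing one) plus surrounding whitespace is an assumption that is stricter than what the thread describes.

```python
def check_collection_name(name: str) -> str:
    """Return the name unchanged if it looks safe, otherwise raise ValueError.

    Hypothetical rule set (an assumption, not Solr's real one):
      * the name must not be empty,
      * it must not start or end with whitespace,
      * it must not contain "/", since that is the znode path separator.
    """
    if not name:
        raise ValueError("collection name must not be empty")
    if name != name.strip():
        raise ValueError(f"collection name {name!r} has leading or trailing whitespace")
    if "/" in name:
        raise ValueError(f"collection name {name!r} must not contain '/'")
    return name
```

Running such a check before anything is queued for the overseer would reject a name like 'mycollection/ ' up front instead of leaving an unprocessable item in the queue.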

 

Regards,

 

Patrick


4 REPLIES 4

Explorer

Yes, we manage the cluster with CM. Reading your reply, I'm now sure the new edge node we added never got a 'deploy client config', so it was missing the proper settings. I did not know this at the time, and solrctl did not work as expected without the proper client configs. I remember manually adding the settings to the solrctl command, most likely without the required /solr root, which resulted in the wipe of ZooKeeper's '/'. Thanks for clearing this up.

 

Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with --force, is pretty scary and not what you expect as an end user of a Solr-specific tool.

 

Thanks for filing the reports,

  Rob

 

 

Cloudera Employee

"Still, for a Solr CLI tool to default back to '/' of the entire quorum without any notice, and to clear it with --force, is pretty scary and not what you expect as an end user of a Solr-specific tool."

 

I understand. However, you are passing the "--force" option, which is really only meant for the case where you absolutely want to force the clearance ("rm -f /*" being a classic, similar case). Without this option we would complain and tell you that if you really want to do this, you need to use --force (given that it might be dangerous, etc.).

 

So the problem lies in what should we do to better handle this case.

 

Requiring you to say "solrctl --force --really" or somesuch doesn't sound reasonable.

 

Say you specify "/" in your --zk rather than "/solr" (or it's the default, as in your case). One option is that we could refuse to make the change if we find znodes in "/" (in this case) that shouldn't belong there. However, that seems error-prone and not a great solution (say it's not /hbase but something else, etc.). Say you have "/solr1" and "/solr2": the same issue would apply if you accidentally specified the wrong one with --force.

 

Any suggestions/ideas?

 

I guess solrctl could prompt you interactively when you use --force:

 

"solrctl is about to reinitialize '/solr' repository, accept?"

or 

"solrctl is about to reinitialize "/" repository, accept?" in your case.

 

Non-interactive use might suffer from this, but a "-y" option or somesuch could compensate. This is the way I'm leaning at the moment.
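
A rough sketch of the prompt-plus-"-y" idea, written in Python purely for illustration. This is not solrctl's code; the function name `confirm_reinit` and its `assume_yes` and `ask` parameters are hypothetical. The prompt wording follows the examples above.

```python
def confirm_reinit(zk_root: str, assume_yes: bool = False, ask=input) -> bool:
    """Decide whether a forced reinitialization of `zk_root` may proceed.

    assume_yes models a hypothetical "-y" flag for non-interactive use.
    ask is the function used to read the answer; it defaults to input()
    and is injectable so scripts and tests can supply their own.
    """
    if assume_yes:
        # Non-interactive callers opted in explicitly, so skip the prompt.
        return True
    answer = ask(f"solrctl is about to reinitialize '{zk_root}' repository, accept? [y/N] ")
    # Anything other than an explicit yes counts as a refusal.
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" on an empty or unrecognized answer means an accidental Enter keeps the existing znodes, and showing the resolved root in the prompt is what would have exposed the "/" default in this case.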

Explorer

I'm wondering whether there is ever a reason for Solr to live in the root of a ZooKeeper install. Shouldn't it always be in some path inside '/'? In that case, --zk being '/' would indicate a problem, either in the configuration or in the user making a mistake, and that is something you could alert on or even refuse to run with.
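
To make the idea concrete, here is a small Python sketch of such a refusal. It is illustrative only and not how solrctl is implemented; the function names `zk_chroot` and `require_non_root_chroot` and the example host names are made up. It assumes the usual ZooKeeper connect-string form, where an optional chroot path follows the host list.

```python
def zk_chroot(connect_string: str) -> str:
    """Return the chroot of a ZooKeeper connect string, or "/" if none is given.

    Example form: "zk01.example.com:2181,zk02.example.com:2181/solr"
    """
    # The host list contains no "/", so the first "/" starts the chroot.
    _, sep, path = connect_string.partition("/")
    if not sep:
        return "/"
    # Drop redundant slashes so "host:2181/" and "host:2181/solr/" normalize.
    return "/" + path.strip("/")


def require_non_root_chroot(connect_string: str) -> str:
    """Return the chroot, refusing to continue if it resolves to "/"."""
    chroot = zk_chroot(connect_string)
    if chroot == "/":
        raise ValueError(
            "refusing to run: --zk resolves to '/', which would affect every "
            "service sharing this ZooKeeper quorum"
        )
    return chroot
```

With this check, a connect string without a chroot (the situation on our edge node) would stop with an error before anything is deleted, while one ending in /solr would pass.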

 

Adding the prompt on --force would be a great step and I see the use of the -y option.