09-28-2015 01:59 PM
I have done four different upgrades (on four different clusters) and I get this error every time. I have to wipe out /hbase and lose the data, which is the exact opposite of the reason I am doing an upgrade in the first place.
There must be a step missing from the upgrade instructions, I have followed them each time.
I tried your solution here, and it doesn't work.
When I try to restart HBase, the master fails and I get this:
Failed to become active master
org.apache.hadoop.hbase.util.FileSystemVersionException: HBase file layout needs to be upgraded. You have version null and I want version 8. Consult http://hbase.apache.org/book.html for further information about upgrading HBase. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'.
I run hbase hbck -fixVersionFile and it gets stuck on
15/09/28 20:51:58 INFO client.RpcRetryingCaller: Call exception, tries=14, retries=35, started=128696 ms ago, cancelled=false, msg=
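For anyone hitting the same wall: before running the repair, it can help to confirm whether the version file is actually missing and to run hbck as the hbase service user rather than root. A rough sketch, assuming the default hbase.rootdir of /hbase (adjust the path for your cluster):

```shell
# Check whether the version file exists at all under the HBase root dir.
hdfs dfs -ls /hbase/hbase.version

# Run the repair as the hbase service user, not as root,
# so HDFS permissions and Kerberos credentials line up.
sudo -u hbase hbase hbck -fixVersionFile
```

Note that hbck talks to a live cluster, so endless RpcRetryingCaller retries usually mean it cannot reach a running master, which is a chicken-and-egg problem when the master itself refuses to start over the missing version file.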
09-29-2015 10:14 AM
"If you delete the /hbase directory in zookeeper, you might be able to keep the data."
Thanks for the response. I am not sure how I delete this just in ZooKeeper. Is there a command for that?
09-29-2015 11:04 AM
You have to use the command line.
Should be something like this:
# Start the command line and connect to any of the zk servers
# If you are not using CDH then the command is zkCli.sh
# If your cluster is kerberized you need to kinit first, otherwise the delete will fail
zookeeper-client -server localhost:2181
# Once in the shell, run this to delete the znode with the metadata
rmr /hbase
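If you prefer not to enter the interactive shell, zkCli-style clients also accept a single command on the invocation line, so a one-shot variant should work (assuming the zookeeper-client wrapper is on your PATH; on a plain ZooKeeper install substitute zkCli.sh):

```shell
# One-shot delete without entering the interactive shell.
# WARNING: this wipes ALL HBase state in ZK, including replication state.
zookeeper-client -server localhost:2181 rmr /hbase

# Verify the znode is gone:
zookeeper-client -server localhost:2181 ls /
```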
04-20-2016 11:53 AM
Hey everyone, this is a great thread and I might be showing my "HBase age" here with old advice, but unless something has changed in recent versions of HBase, you cannot use these steps if you are using HBase replication.
The replication counter, which stores the progress of your synchronization between clusters, is kept as a znode under /hbase/replication in ZooKeeper, so you'll completely blow away your replication state if you do an "rmr /hbase".
Please be super careful with these instructions. And to answer @Amanda's question in this thread about why this happens with each upgrade: this RIT problem usually appears if HBase was not cleanly shut down. Maybe you're trying to upgrade or move things around while HBase is still running?
04-20-2016 12:23 PM
Well, it's been a couple of years since I supported HBase, but what we used to do is delete all the znodes in the /hbase directory in ZK EXCEPT for the /hbase/replication dir. You just have to be a little more surgical about what you're deleting in that RIT situation, IF you're using the HBase replication feature to back your cluster up to a secondary cluster. If not, the previous advice is fine.
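The surgical version described above might look something like this in the ZooKeeper shell. The child znode names below are typical defaults and vary between HBase versions, so list what actually exists on your cluster rather than trusting this list:

```shell
# Connect to ZooKeeper.
zookeeper-client -server localhost:2181

# Inside the shell: see which children actually exist first.
ls /hbase

# Delete HBase's runtime state, child by child...
rmr /hbase/region-in-transition
rmr /hbase/table
rmr /hbase/meta-region-server
# ...and so on for the remaining children, but leave this one alone:
# rmr /hbase/replication   <-- do NOT delete; holds replication progress
```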
Ultimately, regions should not get stuck in transition, though. What version of HBase are you running? We used to have tons of bugs in older versions that would cause this situation, but those should be resolved long ago.
04-20-2016 12:26 PM
Nowadays there is a "clean" operation in the shell admin utilities that can be used to remove data files, ZK data, or both.
I guess that tool takes into consideration what you are pointing out.
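For reference, on versions that ship it, the clean utility mentioned above is invoked like this. It refuses to run against a live cluster, so stop HBase first:

```shell
# Must be run with HBase shut down.

# Remove only the HBase znodes in ZooKeeper:
hbase clean --cleanZk

# Remove only the HBase data in HDFS:
hbase clean --cleanHdfs

# Or both:
hbase clean --cleanAll
```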
09-14-2017 02:54 PM
When you run OfflineMetaRepair, most likely you will run it as your own user or as root. Then you may get some opaque errors like "java.lang.AbstractMethodError: org.apache.hadoop.hbase.ipc.RpcScheduler.getWriteQueueLength()".
If you check in HDFS, you may see that the meta directory is no longer owned by hbase:
$ hdfs dfs -ls /hbase/data/hbase/
Found 2 items
drwxr-xr-x   - root  hbase          0 2017-09-12 13:58 /hbase/data/hbase/meta
drwxr-xr-x   - hbase hbase          0 2016-06-15 15:02 /hbase/data/hbase/namespace
Manually running chown -R on it and restarting HBase fixed it for me.
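Spelled out, the ownership fix above would look roughly like this, assuming the default hbase:hbase user and group and an hdfs superuser to run it as:

```shell
# Give the meta directory back to the hbase user.
sudo -u hdfs hdfs dfs -chown -R hbase:hbase /hbase/data/hbase/meta

# Confirm ownership looks right, then restart HBase.
hdfs dfs -ls /hbase/data/hbase/
```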