
HBase backup - offline, standalone mode (no HDFS)

We are looking for the best way to backup/restore our HBase (Phoenix) databases in some of our development environments. These environments are running a standalone install of HBase, so no HDFS, writes go to the filesystem.

I have looked through https://community.hortonworks.com/questions/6584/hbase-table-dump-to-flat-files.html and other documents, and most comments refer to HDFS commands, which obviously don't apply in this case.

Can we just zip up the data directory? What about the metadata? We want to be able to "export" the database, and restore it to another environment (or over the existing one) at some point in the future. Having a portable artifact like a RDBMS backup would be ideal.

Accepted Solution

Just stop HBase and copy the contents of your hbase.rootdir directory (in the HDFS version it is /apps/hbase/data; in your case it's somewhere on your local filesystem). That becomes your backup artifact. Then, as a test, restore it to another environment and make sure it works by listing and scanning some tables. All HBase metadata is included there, and it should work as-is.
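The stop-copy-restore round trip can be sketched in plain shell. The snippet below is a runnable sketch that stands in a scratch directory for hbase.rootdir (the file names and paths are made up for illustration); in real use you would run bin/stop-hbase.sh first and point ROOTDIR at your actual hbase.rootdir:

```shell
#!/bin/sh
# Sketch of a cold backup/restore for standalone HBase.
# Hypothetical paths: in real use, stop HBase first (bin/stop-hbase.sh)
# and set ROOTDIR to your hbase.rootdir (e.g. /test/hbase).
set -e

WORK=$(mktemp -d)
ROOTDIR="$WORK/hbase"       # stand-in for hbase.rootdir
RESTORE="$WORK/restore"     # stand-in for the target environment
mkdir -p "$ROOTDIR/data/default/t1" "$RESTORE"
echo "hfile-bytes" > "$ROOTDIR/data/default/t1/region1.hfile"

# 1. Back up: archive the whole rootdir while HBase is stopped.
tar -czf "$WORK/hbase-backup.tar.gz" \
    -C "$(dirname "$ROOTDIR")" "$(basename "$ROOTDIR")"

# 2. Restore: unpack into the target environment's rootdir location.
tar -xzf "$WORK/hbase-backup.tar.gz" -C "$RESTORE"

# 3. Verify the copy is identical before starting HBase on it.
diff -r "$ROOTDIR" "$RESTORE/hbase" && echo "backup matches"
```

Since the data directory is just files on the local filesystem, the artifact is as portable as any tarball; only starting HBase on top of it afterwards confirms the restore.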


7 REPLIES

In our case, it is acceptable for HBase to be stopped before the "backup", so we can be sure to be consistent.


For database backup / restore, please take a look at HBASE-7912.

The git branch is also named HBASE-7912.

FYI

@Jason Knaster I don't think we have any direct method available in the current HDP release. Please see HBASE-7912, as Ted mentioned.


Thanks, I will test that out in the next day or two - I think this is what I was looking for (confirmation that all the metadata I need will be in the data directory).

Yes, the metadata is stored in special HBase tables in the 'hbase' namespace. In recent versions of HBase you can inspect them by opening the hbase shell and running, for example:

list_namespace_tables 'hbase'
scan 'hbase:meta'

This approach won't work for distributed HBase, where some changes to "meta" would be required. Also, in your case it might be necessary to restore the backup directory to the same path on the target system, but most likely not. Just give it a try.
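If the target machine cannot mirror the source path, the usual knob is hbase.rootdir in the target's hbase-site.xml. A minimal sketch, where the /data/hbase path is a made-up example and the file is written to a scratch directory:

```shell
#!/bin/sh
# Hypothetical sketch: point a restored copy at a different location
# by setting hbase.rootdir on the target. /data/hbase is made up.
set -e

NEW_ROOTDIR=/data/hbase
CONF=$(mktemp -d)           # stand-in for the target's conf directory

cat > "$CONF/hbase-site.xml" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$NEW_ROOTDIR</value>
  </property>
</configuration>
EOF

grep 'hbase.rootdir' -A1 "$CONF/hbase-site.xml"
```

Restart HBase after changing this so it picks up the new rootdir.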

@Jason Knaster

I just tested this scenario on HBase 1.1.2 and it worked.

[root@ey ~]# cat /usr/hdp/current/hbase-master/conf/hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///test/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/test/zookeeper</value>
  </property>
</configuration>
[root@ey hbase-master]# ./bin/start-hbase.sh
[root@ey hbase-master]# hbase shell

hbase(main):001:0> create 't1','c1'
0 row(s) in 1.4790 seconds
=> Hbase::Table - t1
hbase(main):002:0> put 't1','123','c1:id','123m'
0 row(s) in 0.1500 seconds
hbase(main):003:0> scan 't1'
ROW                                                  COLUMN+CELL
 123                                                 column=c1:id, timestamp=1464183553280, value=123m
1 row(s) in 0.0340 seconds

Then stopped the standalone HBase.

Compressed and transferred the hbase dir to another node.

[root@ey hbase-master]# tar -cvf  test.tar /test
[root@ey ~]# scp test.tar root@AD:/root/

On Node "AD"

[root@AD ~]# tar xvf test.tar
[root@AD ~]# mv /root/test /

Copied the same hbase-site.xml from the primary HBase.

[root@AD hbase-master]# ./bin/start-hbase.sh

[root@AD hbase-master]# hbase shell
hbase(main):001:0> list
TABLE
t1
1 row(s) in 0.2860 seconds
=> ["t1"]
hbase(main):002:0> scan 't1'
ROW                                                  COLUMN+CELL
 123                                                 column=c1:id, timestamp=1464183553280, value=123m
1 row(s) in 0.1290 seconds
hbase(main):003:0>