Hbase ExportSnapshot copy-to localFS(NFS)

New Contributor

Can I export an HBase table snapshot directly to the local filesystem of, say, a DataNode server?

For example: org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to fs://local_linux_fs_dir

The goal is to mount external NFS storage on the local filesystem and export the snapshots there.

1 ACCEPTED SOLUTION

Re: Hbase ExportSnapshot copy-to localFS(NFS)

Master Guru
ExportSnapshot runs as a MapReduce (MR) job, so its tasks execute across your NodeManager hosts. A local filesystem destination URI, such as file:///local_linux_fs_dir, therefore only works if that path is visible, with the same consistent content, on all of your cluster hosts.

One way to do this is to mount the same NFS export on all hosts, and then to limit ExportSnapshot's parallelism so that the writes do not overload the NFS server (keep the number of map tasks low, for example with the -mappers option).
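
As a rough sketch of the shared-NFS approach: the mount point /mnt/nfs/hbase-snapshots and the mapper count of 4 below are placeholder assumptions, not values from this thread, and the path is assumed to be the same NFS mount on every cluster host. -mappers is ExportSnapshot's option for capping the number of map tasks. The snippet only prints the command unless RUN=1 is set.

```shell
# Sketch only: mount point and mapper count are hypothetical, adjust to your cluster.
SNAPSHOT="MySnapshot"
DEST="file:///mnt/nfs/hbase-snapshots"   # assumed: same NFS mount on every host
MAPPERS=4                                # kept low so the NFS server is not overloaded

# Build the command as positional parameters so quoting is preserved.
set -- hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot "$SNAPSHOT" \
  -copy-to "$DEST" \
  -mappers "$MAPPERS"

# Dry run by default: show the command; set RUN=1 to actually execute it.
EXPORT_PREVIEW="$*"
echo "$EXPORT_PREVIEW"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Starting with a small mapper count and raising it only if the NFS server keeps up is probably the safer direction.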

If that's not desirable, you can instead run the MR job in local mode, which is still parallel but only to a limited degree, by passing -Dmapreduce.framework.name=local to ExportSnapshot before any other option.
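
A minimal sketch of the local-mode variant, assuming the NFS storage is mounted at /mnt/nfs/hbase-snapshots (a hypothetical path) on the host where the command is launched. The -D property is placed first, ahead of ExportSnapshot's own options, as described above. The snippet only prints the command unless RUN=1 is set.

```shell
# Sketch only: the destination path is a hypothetical mount point on the launching host.
SNAPSHOT="MySnapshot"
DEST="file:///mnt/nfs/hbase-snapshots"

# The -D generic option comes before any ExportSnapshot option.
set -- hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -Dmapreduce.framework.name=local \
  -snapshot "$SNAPSHOT" \
  -copy-to "$DEST"

# Dry run by default: show the command; set RUN=1 to actually execute it.
LOCAL_EXPORT_PREVIEW="$*"
echo "$LOCAL_EXPORT_PREVIEW"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```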