HBase import/export questions

Rising Star

Some questions regarding import/export in HBase:

I used the command below to export a table:

/usr/bin/hbase org.apache.hadoop.hbase.mapreduce.Export emp /tmp/emp

I could not find the output anywhere on the host, but finally found it in HDFS. Is there a way to export to a local file system directory?

Will import work only if the table is empty of data?

So does truncating the table (to delete all of its data) need to precede the import command?

Or is there an overwrite option in import (to overwrite existing data in table)?

Is it possible to append data with import, i.e. the table already has some data and we want to add more data with import?

Also, is there any way to extract the table schemas, including 'create table', from HBase, or is 'describe <table>' the only way?

Appreciate the insights.

3 REPLIES

Super Guru

"What is the way to export to file system directory?"

You cannot do this. The MapReduce job can only write to HDFS because it runs across many nodes. If you need the files on the local file system, use the HDFS CLI to copy them there.
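The copy step can be sketched roughly as follows. The HDFS path /tmp/emp is the one from the export command in the question; the local destination /tmp/emp_local and the DRY_RUN switch are assumptions made for illustration. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: copy an HBase Export output directory from HDFS to the local file system.
set -eu

HDFS_EXPORT_DIR="${HDFS_EXPORT_DIR:-/tmp/emp}"  # HDFS path given to the Export job
LOCAL_DIR="${LOCAL_DIR:-/tmp/emp_local}"        # hypothetical local destination

# Print the command by default; run it for real only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

copy_export_to_local() {
  # List the files that the Export MapReduce job wrote.
  run hdfs dfs -ls "$HDFS_EXPORT_DIR"
  # Copy the whole directory out of HDFS onto this host.
  run hdfs dfs -get "$HDFS_EXPORT_DIR" "$LOCAL_DIR"
}

copy_export_to_local
```

The copied files are the job's binary output, so they are mainly useful for backup or transfer rather than for reading directly.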

"Will import work only if the table is empty of data?"

No, the import job does not require the table to be empty.

"Is it possible to append data with import ie table already has some data and we want to add more data with import."

This is essentially how the Import job works. The original timestamp on the exported data is preserved, so if your destination table has a cell for the same key with a newer timestamp, you will not see the older data after the import.
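A minimal sketch of importing into a table that already holds data, then scanning several versions per cell to see which timestamps are present. The table name emp and path /tmp/emp come from the question; the DRY_RUN switch and the choice of 3 versions are assumptions. The scan only returns as many versions as the column family is configured to keep.

```shell
#!/bin/sh
# Sketch: import exported data into an existing table, then inspect cell versions.
set -eu

TABLE="${TABLE:-emp}"                           # table name from the question
HDFS_EXPORT_DIR="${HDFS_EXPORT_DIR:-/tmp/emp}"  # Export output directory in HDFS

# Print the command by default; run it for real only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Send one command to the HBase shell in non-interactive mode.
hbase_shell() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ echo \"$1\" | hbase shell -n"
  else
    echo "$1" | hbase shell -n
  fi
}

import_and_inspect() {
  # Import merges the exported cells into the table; existing rows stay in place.
  run hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$HDFS_EXPORT_DIR"
  # Show up to 3 versions per cell, with their timestamps.
  hbase_shell "scan '$TABLE', {VERSIONS => 3}"
}

import_and_inspect
```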

"Also is there any way to extract the table schemas including 'create table' from hbase - or is 'describe <table>' the only way?"

Describe is the only way. However, you may want to consider using HBase Snapshots instead of these Export/Import MapReduce jobs. Snapshots implicitly hold onto the schema, and a restore resets the table to the exact state at snapshot time (as opposed to Import's "merge").
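Both options can be sketched with HBase shell commands driven from a script. The snapshot name emp_snapshot and the DRY_RUN switch are assumptions made for illustration; describe, snapshot, disable, restore_snapshot and enable are standard HBase shell commands. The restore function is defined but not called, since it replaces the table's current contents.

```shell
#!/bin/sh
# Sketch: save a table's schema with describe, and snapshot/restore the table.
set -eu

TABLE="${TABLE:-emp}"                     # table name from the question
SNAPSHOT="${SNAPSHOT:-emp_snapshot}"      # hypothetical snapshot name

# Send one command to the HBase shell in non-interactive mode.
# By default the command is only printed; set DRY_RUN=0 to run it.
hbase_shell() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ echo \"$1\" | hbase shell -n"
  else
    echo "$1" | hbase shell -n
  fi
}

dump_schema() {
  # Prints the column families and their attributes for the table.
  hbase_shell "describe '$TABLE'"
}

take_snapshot() {
  hbase_shell "snapshot '$TABLE', '$SNAPSHOT'"
}

restore_from_snapshot() {
  # The table must be disabled before a snapshot can be restored onto it.
  hbase_shell "disable '$TABLE'"
  hbase_shell "restore_snapshot '$SNAPSHOT'"
  hbase_shell "enable '$TABLE'"
}

dump_schema
take_snapshot
```

Redirecting the output of dump_schema to a file gives a record of the schema that can be used to write the matching create statement by hand.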

Rising Star

I have an HBase table with 3 rows. I export the table.

Then I delete 1 row.

Then I import the data exported in step 1.

There are still only 2 rows in the table.

???

Super Guru

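You're probably running into the leftover tombstone from a delete: https://hbase.apache.org/book.html#_delete

The delete marker is newer than the timestamps on the exported cells, so it keeps hiding the re-imported row. Run a major compaction on your table to remove the tombstone, and then run the import.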
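A rough sketch of that sequence, assuming the table emp and export path /tmp/emp from the question. The flush before the compaction and the DRY_RUN switch are assumptions added for illustration. Note that major_compact only requests the compaction and returns right away, so the import should wait until the compaction has actually finished.

```shell
#!/bin/sh
# Sketch: clear delete markers with a major compaction, then re-run the import.
set -eu

TABLE="${TABLE:-emp}"                           # table name from the question
HDFS_EXPORT_DIR="${HDFS_EXPORT_DIR:-/tmp/emp}"  # Export output directory in HDFS

# Print the command by default; run it for real only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Send one command to the HBase shell in non-interactive mode.
hbase_shell() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ echo \"$1\" | hbase shell -n"
  else
    echo "$1" | hbase shell -n
  fi
}

compact_table() {
  # Assumption: flush first so a recent delete marker is written to a store file.
  hbase_shell "flush '$TABLE'"
  # Request a major compaction; this call is asynchronous.
  hbase_shell "major_compact '$TABLE'"
}

reimport() {
  # Run only after the major compaction has completed.
  run hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$HDFS_EXPORT_DIR"
}

compact_table
reimport
```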