Export HBase data to csv

Master Guru

How do I export HBase data to CSV, either a single table or an entire database (table by table)? I have always built a MapReduce job to do this. However, I understand Apache Pherf has these capabilities. I have also used Phoenix to create a CSV:

1. !outputformat csv
2. !record data.csv
3. select * from mytable;
4. !record
5. !quit

I have also used the HBase Export utility, which creates a Hadoop sequence file in a target HDFS directory. I then create a Hive table on top of this sequence file and select * into another table that uses a CSV storage/file format. This takes a few steps but is not too complicated.

How else, folks? I am looking for the "easy" button here.

1 ACCEPTED SOLUTION

Master Mentor
8 REPLIES 8

Master Guru

You can create a Hive external table mapped onto your HBase table using HBaseStorageHandler (see the example at the end of the Usage section) and then, as you did with your sequence file, "select *" from this table into a CSV table (stored as textfile, fields terminated by ',').
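
For anyone who wants something concrete to start from, here is a rough sketch of the two Hive statements this approach needs, built by a small Python helper so the column mapping stays consistent. The helper name, the derived Hive table names (mytable_hbase, mytable_csv), the column family "cf" and the qualifiers are all hypothetical, and every column is assumed to be a string in a single column family. Note that a plain delimited text table does not quote or escape commas inside values.

```python
def hive_export_statements(hbase_table, column_family, qualifiers):
    """Build HiveQL that maps an HBase table and copies it into a CSV-style table.

    Assumptions (illustrative only): one column family, every column a STRING,
    and Hive table names derived from the HBase table name.
    """
    mapped = f"{hbase_table}_hbase"   # hypothetical name for the mapped table
    csv_table = f"{hbase_table}_csv"  # hypothetical name for the CSV copy

    hive_cols = ", ".join(f"`{q}` STRING" for q in qualifiers)
    # ':key' maps the HBase row key; the rest are family:qualifier pairs.
    mapping = ",".join([":key"] + [f"{column_family}:{q}" for q in qualifiers])

    create_mapped = (
        f"CREATE EXTERNAL TABLE {mapped} (rowkey STRING, {hive_cols})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )
    create_csv = (
        f"CREATE TABLE {csv_table}\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"AS SELECT * FROM {mapped}"
    )
    return [create_mapped, create_csv]


if __name__ == "__main__":
    # Print the statements so they can be pasted into a Hive session.
    for stmt in hive_export_statements("mytable", "cf", ["name", "city"]):
        print(stmt + ";\n")
```

The printed statements can be run in Hive; the comma-delimited files then sit under the second table's warehouse directory.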


I am getting a MapReduce error. It starts but fails within a few minutes. Do you have a working example?

Master Guru

You can also use HDF or Spark if you need to do some interesting things with it.

Contributor

@Sunile Manjee Try the Export utility that ships with HBase, which exports the table to HDFS. Try something like the following:

bin/hbase org.apache.hadoop.hbase.mapreduce.Export table_name file:///tmp/db_dump/

It can also be done using the happybase library. Here's an example:

https://gist.github.com/srs81/5504396
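
In case the link is unavailable, a minimal sketch of the happybase route might look like the following. It is not the code from the gist. It assumes the HBase Thrift server is running, that cell values are UTF-8 text (undecodable bytes are replaced rather than failing), and that you pass the "family:qualifier" columns you want explicitly. The function names, host, table name and output path are placeholders.

```python
import csv


def rows_to_csv(rows, columns, out):
    """Write (row_key, {column: value}) pairs, as yielded by Table.scan(), as CSV.

    `columns` is a list of 'family:qualifier' strings; missing cells become
    empty fields. Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(["rowkey"] + list(columns))
    count = 0
    for key, data in rows:
        cells = [
            data.get(col.encode("utf-8"), b"").decode("utf-8", errors="replace")
            for col in columns
        ]
        writer.writerow([key.decode("utf-8", errors="replace")] + cells)
        count += 1
    return count


def export_table(host, table_name, columns, path):
    """Scan one HBase table through Thrift and stream it into a CSV file."""
    import happybase  # third-party; imported here so rows_to_csv works without it

    connection = happybase.Connection(host)
    try:
        table = connection.table(table_name)
        with open(path, "w", newline="") as out:
            # Limiting the scan to the wanted columns keeps the transfer small.
            return rows_to_csv(table.scan(columns=columns), columns, out)
    finally:
        connection.close()


# Example call (placeholder values):
# export_table("localhost", "mytable", ["cf:name", "cf:city"], "mytable.csv")
```

Rows are streamed one at a time, so memory use stays flat, but a single Thrift scan will be slow for very large tables compared with a MapReduce-based export.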

Master Mentor

Super Collaborator

Hi @Sunile Manjee,

I am following the HBase export table technique. I did an export and created a Hive table stored as sequencefile, but when I load the sequence file data into the Hive table, it gives me this error:

java.lang.RuntimeException: java.io.IOException: WritableName can't load class: org.apache.hadoop.hbase.io.ImmutableBytesWritable

It would be really helpful if you could let me know the solution. Thanks.

Master Guru

@mrizvi

Do you mind opening a separate HCC post for your question?

Super Collaborator

Sure @Sunile Manjee, let me do it.