I am searching for efficient ways of moving Pig output on HDFS into HBase tables.

Re: I am searching for efficient ways of moving Pig output on HDFS into HBase tables.

These 3 options may work for you.

1. Write a MapReduce job by hand and use the HBase bulk load API. This should load quickly, but the code takes some time to write.

2. Write a Pig script that uses Pig's native HBaseStorage to load your data from HDFS into HBase.

3. Create a Hive external table that points to the Pig output directory, then create a second Hive table that maps to the HBase table. Once both tables exist, you can run an INSERT INTO the HBase-backed table with a SELECT * from the external table. Make sure both tables have the same structure.
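
Options 2 and 3 could look roughly like the sketch below, which just writes a Pig script and a Hive script to disk. Everything specific here is an assumption for illustration: the HBase table `mytable` with column family `cf` (assumed to exist already), the field names, the tab delimiter, and the path `/user/me/pig_output`.

```shell
#!/bin/sh
# Sketch only: table, column family, field names, and paths are hypothetical.

# Option 2: Pig script that stores a relation into an existing HBase table.
# HBaseStorage uses the first field of the relation (id) as the row key.
cat > store_to_hbase.pig <<'EOF'
data = LOAD '/user/me/pig_output' USING PigStorage('\t')
       AS (id:chararray, col1:chararray, col2:chararray);
STORE data INTO 'hbase://mytable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col1 cf:col2');
EOF

# Option 3: Hive external table over the Pig output, plus a Hive table
# mapped onto the existing HBase table, then a plain INSERT ... SELECT.
cat > load_hbase.hql <<'EOF'
CREATE EXTERNAL TABLE pig_out (id STRING, col1 STRING, col2 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/me/pig_output';

CREATE EXTERNAL TABLE hbase_out (id STRING, col1 STRING, col2 STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col1,cf:col2")
TBLPROPERTIES ("hbase.table.name" = "mytable");

INSERT INTO TABLE hbase_out SELECT * FROM pig_out;
EOF

echo "wrote store_to_hbase.pig and load_hbase.hql"
```

You would then run one of them with `pig -f store_to_hbase.pig` or `hive -f load_hbase.hql`. If the Pig job that produces the output is under your control, the STORE statement could also go at the end of that job, so the data never needs a second pass from HDFS.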

Re: I am searching for efficient ways of moving Pig output on HDFS into HBase tables.

Is the output delimited? If so, why not just use the bulk load tool (ImportTsv)? I don't think you need to write a MapReduce job.

http://hbase.apache.org/0.94/book/ops_mgt.html#importtsv
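
For reference, the two-step bulk load described on that page might be scripted along these lines. This is a sketch, not a tested recipe: the table name, column family, and HDFS paths are made up, and the `run` helper is only there so the commands are printed unless you set `DRY_RUN=0`. ImportTsv expects tab-separated input by default, which matches Pig's default PigStorage output.

```shell
#!/bin/sh
# Hypothetical names: adjust the table, column family, and paths for your cluster.
TABLE="mytable"
INPUT_DIR="/user/me/pig_output"
HFILE_DIR="/user/me/hfiles"
COLUMNS="HBASE_ROW_KEY,cf:col1,cf:col2"

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

bulk_load() {
  # Step 1: ImportTsv writes HFiles to HFILE_DIR instead of issuing Puts.
  # For comma-delimited input, add the option '-Dimporttsv.separator=,'.
  run hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    "-Dimporttsv.columns=$COLUMNS" \
    "-Dimporttsv.bulk.output=$HFILE_DIR" \
    "$TABLE" "$INPUT_DIR" || return 1

  # Step 2: hand the generated HFiles over to the running table.
  run hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    "$HFILE_DIR" "$TABLE"
}

bulk_load
```

Leaving out `-Dimporttsv.bulk.output` makes ImportTsv write to the table directly through Puts, which is simpler but usually slower for large loads.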
