Import / Load data from CSV to HBase
Labels: Apache HBase, Apache ZooKeeper, HDFS
Created on 12-17-2016 11:58 PM - edited 09-16-2022 03:51 AM
I am new to Hadoop development and am currently exploring HBase tables.
I want to load data from my CSV file, which has more than 10 million rows, into an HBase table, but I do not know how to do it. Can anybody help me?
What are the steps to populate an HBase table from my CSV file?
Thank you very much, I need somebody's help.
Created 12-18-2016 12:04 AM
You are looking for the bulk-load option. Read up on
http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#_importtsv to learn
how to use ImportTsv to prepare bulk-loadable output, followed by
http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#arch.bulk.load.complete,
which shows how to load the prepared output into HBase.
There's also a slightly dated example of the process on the Cloudera
Engineering blog, which uses a CSV file with the above process:
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/
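
For example, here is a minimal sketch of those two steps, assuming a table named wordcount with a single column family f and a comma-separated file already in HDFS (the table name, paths, and column names are illustrative, not taken from the docs above):

    # Step 1: parse the CSV and write HFiles to a staging directory
    # instead of doing live puts into the table
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=, \
      -Dimporttsv.columns=HBASE_ROW_KEY,f:count \
      -Dimporttsv.bulk.output=/user/<you>/hfiles \
      wordcount /user/<you>/word_count.csv

    # Step 2: move the prepared HFiles into the table
    # (the "complete bulk load" step from the second link)
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      /user/<you>/hfiles wordcount

Note that without -Dimporttsv.bulk.output, ImportTsv writes directly into the table through normal puts and step 2 is not needed.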
Created 12-20-2016 05:24 AM
So do I need to download ImportTsv separately, or is it already included in the HBase I downloaded and installed on my cluster?
Created 12-20-2016 11:59 PM
I tried to use the command
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,f:count wordcount word_count.csv
and I get an error like this: permission denied: user=xxxxx, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
What must I do? Can you help me?
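
That error usually means the user submitting the job has no home directory in HDFS, so the job tries to write under /user itself and is denied. A hedged sketch of the usual fix, run as the HDFS superuser, where <username> stands in for the masked user above:

    # Create the user's HDFS home directory and hand over ownership
    sudo -u hdfs hdfs dfs -mkdir -p /user/<username>
    sudo -u hdfs hdfs dfs -chown <username> /user/<username>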
Created 12-21-2016 01:06 AM
I have made it work!
I can upload the CSV now. Thanks a lot for your help!
I appreciate it 😄
Created 12-25-2016 02:43 AM
Sorry to bother you again.
I uploaded the file once successfully, but when I try to upload the same file a second time, the MapReduce job succeeds and the output file exists, yet my table is still empty. I don't know why, because there is no error in the log.
The only thing I change is the name of the output folder: the first time my output folder was "output", the next time I changed it to "output2", and so on.
Do you know why this happens? Thank you very much.
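
One hedged guess, based on the workflow in the first reply: if ImportTsv is run with -Dimporttsv.bulk.output pointing at "output2", it only prepares HFiles in that directory; the table stays empty until they are loaded with the complete-bulk-load step, for example (paths illustrative):

    # Load the HFiles prepared in output2 into the table
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      /user/<you>/output2 wordcount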
Created 01-08-2018 01:58 AM
I have a problem when importing data into an HBase table. I've tried to use ImportTsv, but the problem is that my file has a very large number of columns (1,000). Do I have to write out all the columns, or is there another way to build the column list automatically from the file?
Thank you.
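
One possible approach, not from this thread: if the first line of the file is a header row, the -Dimporttsv.columns value can be built from it instead of being typed out by hand. A sketch, assuming a comma-separated header, a single column family f, and illustrative names (mytable, data.csv, <you>):

    # Build the columns spec from the CSV header: the first field becomes
    # the row key, every other header field becomes f:<name>
    header_cols=$(hdfs dfs -cat /user/<you>/data.csv | head -1 | cut -d, -f2-)
    columns="HBASE_ROW_KEY$(echo "$header_cols" | tr ',' '\n' | sed 's/^/,f:/' | tr -d '\n')"

    # Note: strip the header line from the data first, or it will be
    # imported as an ordinary row
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=, \
      -Dimporttsv.columns="$columns" \
      mytable /user/<you>/data.csv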
