Import / Load data from CSV to HBase
Labels: Apache HBase, Apache ZooKeeper, HDFS
Created on 12-17-2016 11:58 PM - edited 09-16-2022 03:51 AM
I am new to Hadoop development and am currently exploring HBase tables.
I want to load data from my CSV file, which has more than 10 million rows, into an HBase table, but I do not know how to do it. Can anybody help me?
What are the steps to populate an HBase table from my CSV file?
Thank you very much, I need somebody's help.
Created 12-18-2016 12:04 AM
You are looking for the bulk-load option. Read up on
http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#_importtsv to learn
how to use ImportTsv to prepare bulk-loadable output, followed by
http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#arch.bulk.load.complete,
which shows how to load the prepared output into HBase.
There's also a slightly dated example of the process on the Cloudera
Engineering blog, which uses a CSV file with the above process:
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/
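
For example, here is a minimal sketch of those two steps, assuming a table named wordcount with a single column family f and a comma-separated file already in HDFS (the table name, paths, and column names are illustrative, not taken from the docs above):

    # Step 1: parse the CSV and write HFiles to a staging directory
    # instead of doing live puts into the table
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=, \
      -Dimporttsv.columns=HBASE_ROW_KEY,f:count \
      -Dimporttsv.bulk.output=/user/<you>/hfiles \
      wordcount /user/<you>/word_count.csv

    # Step 2: move the prepared HFiles into the table
    # (the "complete bulk load" step from the second link)
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      /user/<you>/hfiles wordcount

Note that without -Dimporttsv.bulk.output, ImportTsv writes directly into the table through normal puts and step 2 is not needed.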
Created 12-20-2016 05:24 AM
So do I need to download ImportTsv separately, or is it already included in the HBase I downloaded and installed on my cluster?
Created 12-20-2016 11:59 PM
I tried to use the command
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,f:count wordcount word_count.csv
and I get an error like this: permission denied: user=xxxxx, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
What must I do? Can you help me?
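
That error usually means the user submitting the job has no home directory in HDFS, so the job tries to write under /user itself and is denied. A hedged sketch of the usual fix, run as the HDFS superuser, where <username> stands in for the masked user above:

    # Create the user's HDFS home directory and hand over ownership
    sudo -u hdfs hdfs dfs -mkdir -p /user/<username>
    sudo -u hdfs hdfs dfs -chown <username> /user/<username>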
Created 12-21-2016 01:06 AM
I have made it work!
I can upload the CSV now. Thanks a lot for your help!
I appreciate it 😄
Created 12-25-2016 02:43 AM
Sorry to bother you again.
I uploaded the file once successfully, but when I try to upload the same file a second time, the MapReduce job succeeds and the output file exists, yet my table is still empty. I don't know why, because there is no error in the log.
The only thing I change is the name of the output folder: the first time my output folder was "output", the next time I changed it to "output2", and so on.
Do you know why this happens? Thank you very much.
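
One hedged guess, based on the workflow in the first reply: if ImportTsv is run with -Dimporttsv.bulk.output pointing at "output2", it only prepares HFiles in that directory; the table stays empty until they are loaded with the complete-bulk-load step, for example (paths illustrative):

    # Load the HFiles prepared in output2 into the table
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      /user/<you>/output2 wordcount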
Created 01-08-2018 01:58 AM
I have a problem when importing data into an HBase table. I've tried to use ImportTsv, but the problem is that my file has a very large number of columns (1,000). Do I have to write out all the columns, or is there another way to build the column list automatically from the file?
Thank you.
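
One possible approach, not from this thread: if the first line of the file is a header row, the -Dimporttsv.columns value can be built from it instead of being typed out by hand. A sketch, assuming a comma-separated header, a single column family f, and illustrative names (mytable, data.csv, <you>):

    # Build the columns spec from the CSV header: the first field becomes
    # the row key, every other header field becomes f:<name>
    header_cols=$(hdfs dfs -cat /user/<you>/data.csv | head -1 | cut -d, -f2-)
    columns="HBASE_ROW_KEY$(echo "$header_cols" | tr ',' '\n' | sed 's/^/,f:/' | tr -d '\n')"

    # Note: strip the header line from the data first, or it will be
    # imported as an ordinary row
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=, \
      -Dimporttsv.columns="$columns" \
      mytable /user/<you>/data.csv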
