One of the first tasks we get to do with HBase is loading it up with data. Most of the time we already have data in some format such as CSV available and we would like to load it into HBase, so let's take a quick look at what the procedure looks like:

Let's examine our example data by looking at the simple structure I have for an industrial sensor:

 id, temp:in,temp:out,vibration,pressure:in,pressure:out
 5842,  50,     30,       4,      240,         340
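
To see how such a row maps onto the table we are about to create: the id field becomes the HBase row key, and the remaining fields become cells in the temp, vibration and pressure column families. Loading this single row by hand would look roughly like the sketch below (for illustration only; ImportTsv does this for us later, and the bare 'vibration' family is simply written with an empty qualifier):

hbase(main):001:0> put 'sensor','5842','temp:in','50'
hbase(main):002:0> put 'sensor','5842','temp:out','30'
hbase(main):003:0> put 'sensor','5842','vibration','4'
hbase(main):004:0> put 'sensor','5842','pressure:in','240'
hbase(main):005:0> put 'sensor','5842','pressure:out','340'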

First of all, make sure HBase is started on your Sandbox (for example, from the Ambari dashboard).

Creating the HBase Table

  • Log in as root to the HDP Sandbox and switch to the hbase user
root> su - hbase
  • Go to the HBase shell by typing
hbase> hbase shell
  • Create the example table by typing
hbase(main):001:0> create 'sensor','temp','vibration','pressure'
  • let's make sure the table was created by typing the command below (an optional describe check is sketched after this list)
hbase(main):001:0> list
  • now, exit the shell by typing 'exit' and let's load some data
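
If you also want to double-check the column families (as mentioned in the step above), describe prints the table layout. The output below is only a rough sketch; the exact attributes vary by HBase version:

hbase(main):001:0> describe 'sensor'
Table sensor is ENABLED
COLUMN FAMILIES DESCRIPTION
{NAME => 'pressure', ...}
{NAME => 'temp', ...}
{NAME => 'vibration', ...}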

Loading the Data

  • let's put the hbase.csv file in HDFS; you may SCP it to the cluster first by using the following command (make sure the file contains only data rows and remove the header line, otherwise ImportTsv will load the header as a regular row)
macbook-ned> scp hbase.csv root@sandbox.hortonworks.com:/home/hbase
  • now put it in HDFS using the following command
hbase> hdfs dfs -copyFromLocal hbase.csv /tmp
  • we shall now execute the ImportTsv command as follows (note that the first CSV column, id, becomes the HBASE_ROW_KEY, so it is not listed again as a separate column)
hbase> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=, -Dimporttsv.columns="HBASE_ROW_KEY,temp:in,temp:out,vibration,pressure:in,pressure:out" sensor hdfs://sandbox.hortonworks.com/tmp/hbase.csv
  • once the MapReduce job is completed, return to the HBase shell and execute
hbase(main):001:0> scan 'sensor'
  • you should now see the data in the table, roughly as in the sketch below
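
For the single example row above, the output should look roughly like this sketch (timestamps elided; exact formatting differs between HBase versions):

hbase(main):001:0> scan 'sensor'
ROW                COLUMN+CELL
 5842              column=pressure:in, timestamp=..., value=240
 5842              column=pressure:out, timestamp=..., value=340
 5842              column=temp:in, timestamp=..., value=50
 5842              column=temp:out, timestamp=..., value=30
 5842              column=vibration:, timestamp=..., value=4
1 row(s)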

Remarks

  • The ImportTsv job generates a massive amount of logs, so make sure you have enough space in /var/log. In a real cluster it is always better to have the logs mounted on a separate partition, to avoid an operational stop because the logs fill up the partition; a quick check is sketched below.
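
A quick way to keep an eye on the log partition while the job runs is with standard Linux commands (the paths below assume the HDP defaults of /var/log and /var/log/hbase; adjust them for your cluster):

hbase> df -h /var/log
hbase> du -sh /var/log/hbase
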
Comments
Explorer

Hi,

I am using Apache HBase (version 1.1.3).

I used the same ImportTsv syntax to import the same data (header removed) and got this error:

syntax error, unexpected ','

Then I removed the 'id' column in '-Dimporttsv.columns' and ran:

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator= ',' -Dimporttsv.columns="HBASE_ROW_KEY,temp:in,temp:out,vibration,pressure:in,pressure:out" sensor /user/hbase.csv

Now I am getting:

syntax error, unexpected tIDENTIFIER

Please help

Thanks,

Sridharan

New Contributor
Hi Ned,
I have the following CSV file with me:

userId,prodId,rating,Date:M,Date:D,Date:Y,Help:a,Help:b,Review:a,Review:b
AO94DHGC771SJ,528881469,5,6,2,2013,0,0,We got this GPS for my husband who is an (OTR) ove,,
AMO214LNFCEI4,528881469,1,11,25,2010,12,15,I'm a professional OTR truck driver and I bought ,,
A3N7T0DY83Y4IG,528881469,3,9,9,2010,43,45,Well what can I say. I've had this unit in my tr,,

I created a table in HBase with the following query:

create 'Producttest1','userId','prodId','rating','Date','Help','Review'

When I try to import the CSV into HBase with the command below, it gives the following error:

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=, -Dimporttsv.columns="HBASE_ROW_KEY,userId,prodId,rating,Date:M,Date:D,Date,Y,Help:a,Help:b,Review:a,Review:b" Producttest1 hdfs://localhost:50070:/mayur/ProductReview/InputFiles/ElectronicsShortTemp.csv

SyntaxError: (hbase):19: syntax error, unexpected ','

Kindly help me with this. Thanks.
Rising Star

@Ned Shawa I tried to follow the example above to import a csv file named drivers.

hbase(main):001:0> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=, –Dimporttsv.columns=”HBASE_ROW_KEY,driver_id,driver_name,certified,wage_plan” drivers /home/bilal/drivers.csv
SyntaxError: (hbase):1: syntax error, unexpected ','
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=, 
                                                                        ^

I am getting the following SyntaxError unexpected ','.

Would you be kind enough to suggest the solution?

Thanking you in anticipation.

Contributor

I am unable to load my data into HBase with this command; I get the following error:

SyntaxError: (hbase):2: syntax error, unexpected tIDENTIFIER

Mine is a fully distributed cluster of Hadoop 2.7.3 and HBase 1.2.5. I have also tried removing the separator argument and loading a TSV file (the ',' given in the above line as the value of the argument separator gives an error anyway). It has probably got something to do with the way the tables are referenced in HBase 1.2.5. Please respond.

Cloudera Employee

This is not an HBase shell command; it just needs to be run as a command from the Unix (or Windows) shell:

/usr/bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=','  -Dimporttsv.columns="HBASE_ROW_KEY,value" spark-defaults hdfs:///tmp/spark-defaults.prop
