04-19-2016 11:17 AM
Hi, dear experts!
I'm trying to load CSV data from HDFS into HBase with ImportTsv (importtsv).
It works perfectly fine when HBASE_ROW_KEY is a single CSV column,
but I don't know how to create a composite HBASE_ROW_KEY from two columns.
For example, I have a CSV with 3 columns:
row1, 1, abc
row1, 2, dd
row2, 1, iop
row3, 1, kk
Each row can be uniquely identified by the first two columns.
Any input will be highly appreciated!
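One way to approach this is to merge the two key columns into a single column before running importtsv, so the tool still sees one HBASE_ROW_KEY field. The following is only a minimal sketch of that preprocessing step in Python. The function name `merge_key_columns` and the `_` key separator are assumptions made for illustration, not something ImportTsv prescribes.

```python
import csv
import io

def merge_key_columns(lines, key_cols=2, key_sep="_"):
    """Join the first `key_cols` fields of each CSV row into one key field.

    `key_sep` is an arbitrary choice (assumption); pick a character that
    cannot occur inside the key parts themselves.
    """
    # skipinitialspace drops the blank that follows each comma in the sample data
    reader = csv.reader(lines, skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # ignore empty lines
        key = key_sep.join(row[:key_cols])
        writer.writerow([key] + row[key_cols:])
    return out.getvalue()

sample = ["row1, 1, abc", "row1, 2, dd", "row2, 1, iop", "row3, 1, kk"]
print(merge_key_columns(sample), end="")
```

The resulting file has two columns, so it could then be loaded with a column mapping such as HBASE_ROW_KEY followed by a single column (for example `cf:val`, where the family and qualifier names are hypothetical).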
04-24-2016 12:06 PM
04-24-2016 12:08 PM
04-29-2016 04:30 PM
11-02-2017 03:24 AM
Hi! Sorry for digging out this thread, but I am currently facing the same problem and decided to run an MR job to transform my data before importing it.
However, I am unsure what the output should look like for HBase to understand it. As far as I know, HBase stores everything as bytes anyway, but treats timestamps differently. So, say I want to use Factory_ID:YYYYMMDD:Order_ID:UID as my composite key. Should I output the parts with ":" as a separator, or just one after another? Will HBase be able to use this information to split the table into different regions?
Thanks in advance!
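For context, HBase treats a row key as an opaque byte array, sorts rows by the byte order of that key, and splits regions on key ranges, so it does not interpret the individual parts. The sketch below shows one possible way to build such a key in Python so that byte order matches the intended order. The function name `make_row_key`, the field widths (4 and 8 digits), and the sample values are all assumptions chosen for illustration.

```python
from datetime import date

def make_row_key(factory_id, day, order_id, uid, sep=":"):
    """Build a Factory_ID:YYYYMMDD:Order_ID:UID key as bytes.

    Numeric parts are zero-padded to fixed widths (assumed widths) so that
    lexicographic byte order agrees with numeric order.
    """
    parts = [
        f"{factory_id:04d}",
        day.strftime("%Y%m%d"),
        f"{order_id:08d}",
        uid,
    ]
    return sep.join(parts).encode("utf-8")

keys = [
    make_row_key(12, date(2017, 11, 2), 531, "a1"),
    make_row_key(2, date(2017, 11, 2), 7, "b2"),
    make_row_key(2, date(2017, 10, 30), 1200, "c3"),
]

# Sorting the raw bytes mimics how rows would be ordered by key
for k in sorted(keys):
    print(k.decode())
```

With fixed-width parts the separator is not strictly needed for ordering; it mainly keeps the key readable and easy to split back into its parts.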
11-02-2017 04:39 AM