Created on 04-19-2016 11:17 AM - edited 09-16-2022 03:14 AM
Hi dear experts!
I'm trying to load CSV data from HDFS into HBase with ImportTsv (importtsv).
It works perfectly fine when HBASE_ROW_KEY is a single CSV column,
but I don't know how to create a composite HBASE_ROW_KEY from two columns.
For example, I have a CSV with 3 columns:
row1, 1, abc
row1, 2, dd
row2, 1, iop
row3, 1, kk
and each row can be uniquely identified by the first two columns.
Any input will be highly appreciated!
Created 04-29-2016 04:30 PM
Created 04-24-2016 12:06 PM
Created 04-24-2016 12:08 PM
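As far as I know, ImportTsv maps exactly one input column to HBASE_ROW_KEY, so one possible workaround is to pre-process the CSV and merge the two identifying columns into a single key column before the import. Below is a minimal Python sketch of that idea. The function name `merge_key_columns` and the "_" key separator are my own assumptions for illustration, not anything ImportTsv requires.

```python
import csv
import io


def merge_key_columns(lines, key_columns=2, key_sep="_"):
    """Yield CSV lines whose first `key_columns` fields are joined into one key.

    `key_sep` is an arbitrary choice (assumption); pick a character that
    cannot occur inside the key fields themselves.
    """
    reader = csv.reader(lines, skipinitialspace=True)
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        key = key_sep.join(f.strip() for f in fields[:key_columns])
        rest = [f.strip() for f in fields[key_columns:]]
        out = io.StringIO()
        csv.writer(out, lineterminator="").writerow([key] + rest)
        yield out.getvalue()


# The sample rows from the question
sample = ["row1, 1, abc", "row1, 2, dd", "row2, 1, iop", "row3, 1, kk"]
merged = list(merge_key_columns(sample))
for line in merged:
    print(line)
# prints:
# row1_1,abc
# row1_2,dd
# row2_1,iop
# row3_1,kk
```

The resulting two-column file can then be loaded the same way as the single-key case, e.g. with -Dimporttsv.separator=, and -Dimporttsv.columns=HBASE_ROW_KEY,cf:col (here cf:col is only a placeholder for your own column family and qualifier). For large inputs the same transformation could be done in a MapReduce, Hive or Pig step instead of a local script.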
Created 11-02-2017 03:24 AM
Hi! Sorry for digging up this thread, but I am currently facing the same problem and decided to run an MR job to transform my data before importing it.
However, I am unsure what the output should look like for HBase to understand it. As far as I know, HBase stores everything as bytes anyway, but treats timestamps differently. So, say I want to chain Factory_ID:YYYYMMDD:Order_ID:UID together as my composite key. Should I output the parts with ":" as a separator, or just one after another? Will HBase be able to use this information to shard the table into different regions?
Thanks in advance!
Created 11-02-2017 04:39 AM
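HBase treats the row key as one opaque byte array and keeps rows sorted by comparing those bytes lexicographically. It does not parse a ":" separator, so the separator is only a convention for whoever reads the key later. Each region covers a contiguous range of row keys, so it is the leading bytes of the key (Factory_ID in this layout) that decide where a row lands. The Python sketch below only illustrates the ordering; `make_row_key`, `padded` and the sample values are hypothetical names and data made up for the example.

```python
def make_row_key(factory_id, yyyymmdd, order_id, uid, sep=":"):
    """Build a composite row key as bytes (hypothetical helper)."""
    parts = [str(factory_id), str(yyyymmdd), str(order_id), str(uid)]
    return sep.join(parts).encode("utf-8")


def padded(number, width=6):
    """Zero-pad a numeric part so byte order matches numeric order."""
    return str(number).zfill(width)


order_ids = (2, 10, 1)

# Python compares bytes lexicographically, the same way HBase orders row keys.
unpadded = sorted(make_row_key("F1", "20171102", n, "u1") for n in order_ids)
fixed = sorted(make_row_key("F1", "20171102", padded(n), "u1") for n in order_ids)

print(unpadded)
# [b'F1:20171102:10:u1', b'F1:20171102:1:u1', b'F1:20171102:2:u1']
print(fixed)
# [b'F1:20171102:000001:u1', b'F1:20171102:000002:u1', b'F1:20171102:000010:u1']
```

With variable-width parts, order 10 sorts before order 1 because "0" is a smaller byte than ":". Fixed-width (zero-padded) parts avoid that, and they also make a separator optional, since every part can be found by its offset. Either way, keys that start with the same Factory_ID sit next to each other, which is something to keep in mind if one factory produces most of the writes.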