Member since
11-02-2017
2
Posts
0
Kudos Received
0
Solutions
11-02-2017
04:40 AM
1 Kudo
You can pass an input directory to the ImportTsv tool, and that directory can contain any number of files. The tool only processes a single file if you pass it a single file path instead of a directory.
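As a rough illustration, here is a small Python sketch that builds the ImportTsv command line with a directory as the input path. The table name, column mapping and HDFS directory are made up for the example; substitute your own.

```python
def import_tsv_command(table, columns, input_path):
    """Build the argv for an ImportTsv run.

    input_path may be a single file or a directory holding many files.
    """
    return [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.ImportTsv",
        "-Dimporttsv.columns=" + ",".join(columns),
        table,
        input_path,
    ]


# Hypothetical table, column mapping and HDFS input directory.
cmd = import_tsv_command(
    "my_table",
    ["HBASE_ROW_KEY", "cf:value"],
    "/user/me/tsv_input",
)
print(" ".join(cmd))
```

The printed line can be run on a node with the HBase client installed, or the list can be handed to `subprocess.run(cmd, check=True)`.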
11-02-2017
04:39 AM
1 Kudo
You are right that it's all just byte sequences to HBase, and that it sorts everything lexicographically. You do not need a separator character when composing your key, because HBase would not treat it as a boundary anyway. A separator is only worth the extra bytes if you want better readability, or if you need to recover the individual data elements from variable-length keys. HBase 'sharding' (region splitting) can be specified manually at table creation time if you know your key pattern and ranges; this is strongly recommended so that the table scales from the beginning. Otherwise, HBase computes a midpoint key by analysing the keys in byte form and splits the region at that point whenever the region reaches the split size threshold.
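To make this concrete, here is a minimal Python sketch (standard library only, no HBase client). It assumes a hypothetical key layout of a 4-byte user id followed by an 8-byte timestamp, both unsigned big-endian, which is my own choice for illustration. It shows that fixed-width fields need no separator, that byte-wise ordering matches the numeric ordering, and how evenly spaced split points could be computed when the key range is known up front.

```python
import struct

# Hypothetical layout: 4-byte user id + 8-byte timestamp, unsigned big-endian.
KEY_FORMAT = ">IQ"


def make_key(user_id, timestamp):
    """Compose a row key from fixed-width fields; no separator is needed."""
    return struct.pack(KEY_FORMAT, user_id, timestamp)


def parse_key(key):
    """Recover the fields; fixed widths make the boundaries unambiguous."""
    return struct.unpack(KEY_FORMAT, key)


def split_points(max_user_id, regions):
    """Evenly spaced boundaries over the user id prefix (regions - 1 of them)."""
    step = (max_user_id + 1) // regions
    return [struct.pack(">I", step * i) for i in range(1, regions)]


keys = [make_key(u, t) for u, t in [(300, 5), (2, 99), (2, 7), (70000, 1)]]

# Python compares bytes lexicographically as unsigned values,
# which is the same ordering HBase applies to row keys.
ordered = sorted(keys)
print([parse_key(k) for k in ordered])

# Boundaries for 4 regions over the full 32-bit user id range.
print([p.hex() for p in split_points(2**32 - 1, 4)])
```

The boundaries would then be supplied as the split keys when creating the table. With variable-length fields the parsing step no longer works without a separator or a length prefix, which is the one case where the extra bytes pay off.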