Created 04-05-2018 07:36 PM
On this forum I saw the following note:

"HBase key itself (if it is random enough not to cause hotspots, then I would suggest pre-splitting without salting to get better scans)"

Let's say I have a table with varying integer row-key values, e.g. 14, 1333, 33, 31232, etc. How can I pre-split them in HBase or Phoenix?
Created 04-06-2018 02:55 PM
Using Phoenix:

Salting automatically prepends a byte, derived from a hash of the row key, so that writes are spread across regions.

Example:
CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SALT_BUCKETS=16
Note: Ideally, for a 16-region-server cluster with quad-core CPUs, choose a salt bucket count between 32 and 64 for optimal performance.
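For intuition, here is a minimal Python sketch of what salting does: a bucket number derived from a hash of the row key is prepended as one extra leading byte. The hash below is illustrative only; Phoenix uses its own internal hash.

```python
# Sketch of how salting spreads sequential row keys across buckets.
# The hash below is illustrative; Phoenix computes its own salt byte.
SALT_BUCKETS = 16

def salt_byte(row_key: bytes) -> int:
    """Derive a bucket number (0..SALT_BUCKETS-1) from the row key."""
    h = 0
    for b in row_key:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h % SALT_BUCKETS

def salted_key(row_key: bytes) -> bytes:
    """Prepend the salt byte so writes fan out across regions."""
    return bytes([salt_byte(row_key)]) + row_key

# Sequential keys land in different buckets instead of one hot region.
keys = [str(i).encode() for i in range(1000, 1010)]
buckets = {salt_byte(k) for k in keys}
print(len(buckets) > 1)  # → True: multiple buckets are hit
```

The cost is that every range scan must now touch all buckets, which is why the note quoted in the question prefers pre-splitting when keys are already random.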
Pre-splitting, by contrast, lets you choose the split points yourself.

Example:
CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SPLIT ON ('CS','EU','NA')
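For the integer keys in the question (14, 1333, 33, 31232, ...), one way to choose split points is to take percentiles of a sample of key values, so each region receives a similar share of rows. A minimal Python sketch (the sample values and helper name are illustrative, not Phoenix internals):

```python
# Sketch: derive evenly loaded split points for integer row keys by
# taking percentiles of a sample of existing/expected key values.

def split_points(sample, regions):
    """Return regions-1 boundary values dividing the sample evenly."""
    s = sorted(sample)
    return [s[len(s) * i // regions] for i in range(1, regions)]

sample_keys = [14, 33, 1333, 31232, 205, 999, 5000, 12000, 25000, 400]
points = split_points(sample_keys, 4)
print(points)  # → [205, 1333, 12000]

# These boundaries could then go into the DDL's SPLIT ON clause, e.g.:
#   CREATE TABLE T (ID BIGINT NOT NULL PRIMARY KEY, V VARCHAR)
#   SPLIT ON (<p1>, <p2>, <p3>)
```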
A column family stores related data in separate files. If your queries use only a subset of columns, it makes sense to group those columns together in a column family to improve read performance.
Example:
The following CREATE TABLE DDL will create two column families, A and B.
CREATE TABLE TEST (MYKEY VARCHAR NOT NULL PRIMARY KEY, A.COL1 VARCHAR, A.COL2 VARCHAR, B.COL3 VARCHAR)
Article:
https://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Created 04-06-2018 03:30 PM
Pre-split table: salting does automatic table splitting, but if you want to control exactly where the table splits occur, without adding an extra byte or changing the row-key order, then you can pre-split the table.
Example:
CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SPLIT ON ('CS','EU','NA')
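To see why pre-splitting preserves range-scan locality while salting does not, here is a small Python sketch (the toy hash and key values are assumptions for illustration, not Phoenix internals):

```python
# Sketch contrasting scan behavior: pre-splitting keeps row keys in
# their natural order, so a range scan touches one contiguous slice;
# salting prefixes a bucket byte, so the same logical range is
# scattered and every bucket must be scanned.
SALT_BUCKETS = 4

def salt(key: str) -> bytes:
    bucket = sum(key.encode()) % SALT_BUCKETS  # toy hash, not Phoenix's
    return bytes([bucket]) + key.encode()

keys = [f"{i:05d}" for i in range(100, 120)]

# Pre-split layout: plain keys stay in natural order -> contiguous scan.
plain = sorted(k.encode() for k in keys)
assert plain == [k.encode() for k in keys]

# Salted layout: sort order is dominated by the leading salt byte.
salted = sorted(salt(k) for k in keys)
buckets_hit = {b[0] for b in salted}
print(sorted(buckets_hit))  # a range scan must visit every bucket listed
```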