Region Split On Phoenix Table

Hi @gsharma 

First of all, thank you for sharing the post on Phoenix and HBase. I must admit it was extremely helpful for understanding some of the internals of these technologies.

I am creating a Phoenix table that has 100+ columns and a composite primary key (ID + DAY).

Sample ID: 700000101

Sample DAY: 20200101

However, I am not sure whether to pre-split the table or use SALT_BUCKETS.

My cluster has 1 master node (8 vCores / 16 GB) and 6 data nodes (8 vCores / 16 GB each).

I need to run aggregations over all of the columns in this table. The table definition is:

 

create table {TABLE_NAME} (
    ID varchar,
    DAY varchar,
    ....
    CONSTRAINT pk PRIMARY KEY (ID,DAY))
COMPRESSION='SNAPPY', BLOOMFILTER='ROW' SPLIT ON ('7000000101','7000500101','7000060101','7000080101','7000900101','7000000101');
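
As a side check on the SPLIT ON list above, here is a minimal Python sketch. It is not part of Phoenix, and the helper names `normalize_split_points` and `region_index` are made up for illustration. It assumes region boundaries behave as sorted, distinct row-key values, and it compares only the ID prefix, ignoring the DAY part of the row key. Run against the list as posted, it shows that the values are not in ascending order and that '7000000101' appears twice, so the six entries give only five distinct boundaries.

```python
from bisect import bisect_right

# Split points copied verbatim from the CREATE TABLE statement above.
split_points = ["7000000101", "7000500101", "7000060101",
                "7000080101", "7000900101", "7000000101"]


def normalize_split_points(points):
    """Return the distinct split points in ascending string order."""
    return sorted(set(points))


def region_index(row_key, boundaries):
    """Return the index of the region that would hold row_key.

    Assumption: region 0 holds keys below the first boundary, and region i
    holds keys from boundaries[i-1] (inclusive) up to boundaries[i] (exclusive).
    """
    return bisect_right(boundaries, row_key)


boundaries = normalize_split_points(split_points)
duplicates = len(split_points) - len(boundaries)

print("sorted, distinct boundaries:", boundaries)
print("duplicate entries:", duplicates)
print("resulting regions:", len(boundaries) + 1)
print("region for ID 7000070000:", region_index("7000070000", boundaries))
```

The same lookup can be used to see how a sample of real IDs would spread over the regions before settling on the split points.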

 

 

select ID,
    SUM(col1),
    SUM(col2),
    SUM(col3),
    ....
    SUM(col110)
from {TABLE_NAME} where DAY >= '20200101' AND DAY < '20200102' group by ID
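
For reference, a query of this shape can be generated rather than typed out by hand. The sketch below is only an illustration: `build_daily_sum_query` and `MY_TABLE` are hypothetical names, the columns are assumed to be called col1 to colN as in the query above, and DAY is treated as a string with a half-open range (start inclusive, end exclusive). The values are pasted into the string for readability only; real code would normally use bind parameters.

```python
def build_daily_sum_query(table, n_cols, day_start, day_end):
    """Build a GROUP BY ID query that sums col1..col<n_cols>.

    The DAY filter is half-open: day_start is included, day_end is excluded.
    """
    sums = ",\n       ".join(f"SUM(col{i})" for i in range(1, n_cols + 1))
    return (
        f"SELECT ID,\n       {sums}\n"
        f"FROM {table}\n"
        f"WHERE DAY >= '{day_start}' AND DAY < '{day_end}'\n"
        f"GROUP BY ID"
    )


# One day of data: 2020-01-01 inclusive to 2020-01-02 exclusive.
print(build_daily_sum_query("MY_TABLE", 3, "20200101", "20200102"))
```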

 

However, the above aggregation takes too long, even for one day's worth of data, and the run time is not consistent.

 

So I am not sure about the following:

What is the ideal way of creating the Phoenix table?

Do I need an index to fetch results faster, within a few seconds? If yes, which column should I index: ID or DAY?

 

By ID:

CREATE INDEX {TABLE_INDX_NAME} ON {TABLE_NAME} (ID) SPLIT ON ('7000000101','7000500101','7000060101','7000080101','7000900101','7000000101')

Or by DAY:

CREATE INDEX {TABLE_INDX_NAME} ON {TABLE_NAME} (DAY) SPLIT ON ('20200101','20200102','20200103','20200104','20200105','20200106')
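
If the DAY-based split is the one to use, the split points do not need to be typed by hand. A small sketch, assuming one split point per calendar day in the YYYYMMDD string format used above (`daily_split_points` is a hypothetical helper, not a Phoenix function):

```python
from datetime import date, timedelta


def daily_split_points(start, days):
    """Return `days` consecutive dates, starting at `start`, as YYYYMMDD strings."""
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days)]


points = daily_split_points(date(2020, 1, 1), 6)

# Render the list in the form used by the SPLIT ON clause.
split_clause = "SPLIT ON (" + ",".join(f"'{p}'" for p in points) + ")"
print(split_clause)
```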

 

Regards

Rajiv
