Hi I'm using apache phoenix to query sql over hbase.
Table schema
CREATE TABLE TABLE_1 (
SF_ID VARCHAR NOT NULL,
ENTITY_ID VARCHAR NOT NULL,
PRODUCT_SKU VARCHAR,
CITY_NAME VARCHAR,
SCREEN_NAME VARCHAR,
PRODUCT_LIST_VIEWS BIGINT,
PRODUCT_LIST_CLICKS BIGINT,
PRODUCT_LIST_CTR FLOAT,
TIMESTAMP BIGINT NOT NULL,
POS INTEGER NOT NULL,
CONSTRAINT pk PRIMARY KEY (SF_ID, ENTITY_ID, TIMESTAMP, POS));
I have created secondary index as follows:-
CREATE INDEX GA_2 ON TABLE_1 (ENTITY_ID) INCLUDE (PRODUCT_LIST_VIEWS, PRODUCT_LIST_CLICKS, PRODUCT_LIST_CTR);
But the following query is taking around 1.5s to 2s when running on 500k rows.
select ENTITY_ID as "entityId", sum(PRODUCT_LIST_VIEWS) as "productViewSum", sum(PRODUCT_LIST_CLICKS) as "productClickSum", sum(PRODUCT_LIST_CTR) as "productCTRSum" from "TABLE_1" group by ENTITY_ID;
Explain plan is as following:-
CLIENT 1-CHUNK 0 ROWS 0 BYTES PARALLEL 1-WAY FULL SCAN OVER GA_2 SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["ENTITY_ID"]
Is there any way to optimize the response time of query?