Created 04-20-2016 02:21 PM
Hi guys
I just set up Phoenix 4.5.2-1.clabs_phoenix1.2.0.p0.774 through Cloudera Manager on CDH 5.6.0. My dev cluster is:
3 boxes
Each is an HP 8300 with 8 cores and 32GB RAM
1 NameNode (NN) and 3 DataNodes (DN)
DDL (this table is created in Phoenix on HBase as well as in Hive)
====
CREATE TABLE IF NOT EXISTS resume_dates (resid VARCHAR, cd VARCHAR, uts BIGINT CONSTRAINT pk PRIMARY KEY (resid));
Sample Data
==========
14008_1_1000522248_0_1108045212,2014-01-30,1391093927
14025_1_1010236513_0_1107883638,2014-01-30,391093930
Num of records
============
23,748,651
Query
=====
select substr(cd, 1,4) as yyyy, count(resid) from RESUME_DATES group by substr(cd, 1,4) order by yyyy asc
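To make it clear what the query returns, here is a minimal sketch that runs the same statement against the two sample rows above. It assumes an in-memory SQLite database as a stand-in, since it happens to accept the same `substr` / `group by` syntax; it is not Phoenix, Hive, or Impala, and says nothing about their performance. The primary key declaration is simplified to SQLite's inline form.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real engines, only to show
# the shape of the result. The DDL is simplified from the Phoenix version.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS resume_dates "
    "(resid VARCHAR PRIMARY KEY, cd VARCHAR, uts BIGINT)"
)

# The two sample rows from the post, copied as given.
sample_rows = [
    ("14008_1_1000522248_0_1108045212", "2014-01-30", 1391093927),
    ("14025_1_1010236513_0_1107883638", "2014-01-30", 391093930),
]
conn.executemany("INSERT INTO resume_dates VALUES (?, ?, ?)", sample_rows)

# Same query as in the post: count records per year prefix of cd.
query = (
    "select substr(cd, 1,4) as yyyy, count(resid) "
    "from RESUME_DATES group by substr(cd, 1,4) order by yyyy asc"
)
rows = conn.execute(query).fetchall()
for yyyy, n in rows:
    print(yyyy, n)
```

Both sample rows fall in the same year, so the result is a single row with that year and a count of 2. On the full table the result would have one row per distinct year prefix.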
Comparison of Timings
==================
Hive on MR = 81.829 seconds
Hive on Spark = 32.78 seconds
Phoenix = 12.234 seconds
Impala = 0.99 seconds
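For easier comparison, the timings above can be turned into speedup factors relative to the slowest engine. This is only arithmetic on the four numbers reported here; the variable names are made up for illustration, and the ratios describe this one query on this one cluster.

```python
# Timings in seconds, copied from the list above.
timings = {
    "Hive on MR": 81.829,
    "Hive on Spark": 32.78,
    "Phoenix": 12.234,
    "Impala": 0.99,
}

# Use the slowest run (Hive on MR) as the baseline.
baseline = timings["Hive on MR"]
speedups = {engine: baseline / secs for engine, secs in timings.items()}

for engine, factor in speedups.items():
    print(f"{engine}: {timings[engine]:.3f} s, {factor:.1f}x vs Hive on MR")
```

Roughly, Hive on Spark is about 2.5 times faster than Hive on MR, Phoenix about 6.7 times, and Impala about 83 times.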
Thanks
sanjay
Created 09-01-2017 09:38 AM