
Phoenix indexes are not updating

-- CREATE TABLE --

create table IF NOT EXISTS ph3_agg1 (
DATET varchar(19),
T1 DATE NOT NULL,
V1 varchar(10) NOT NULL,
o1 integer NOT NULL,
d1 varchar NOT NULL,
i1 varchar(15) NOT NULL,
minttl integer NOT NULL,
maxttl integer NOT NULL,
CONSTRAINT pk PRIMARY KEY(T1, V1, o1, d1, i1, minttl, maxttl))
IMMUTABLE_ROWS=true,
DATA_BLOCK_ENCODING='FAST_DIFF',
SALT_BUCKETS = 3;

-- CREATE INDEXES --
create index idx_ph3_i1 on ph3_agg1(i1);
create index idx_ph3_d1 on ph3_agg1(d1);
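
For comparison, rows written through the regular UPSERT path are supposed to keep global indexes in sync automatically (as I understand the docs); it is the bulk-load path that behaves differently. A hypothetical example (the values are made up):

```sql
-- Rows UPSERTed through Phoenix should maintain global indexes
-- automatically; the column values below are hypothetical.
UPSERT INTO ph3_agg1 (DATET, T1, V1, o1, d1, i1, minttl, maxttl)
VALUES ('2019-01-01 00:00:00', TO_DATE('2019-01-01'), 'vendor1', 1, 'dom1', 'id1', 60, 300);
```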

-- LOAD data --

HADOOP_CLASSPATH=/home/hdp/3.0.0.0-1634/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /home/hdp/3.0.0.0-1634/phoenix/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -g --table ph3_agg1 --input /tmp/load/testcsvfile/test11.csv

-- DATA WAS LOADED, BUT THE INDEXES DID NOT GET BUILT --

So I re-created the table and re-created the indexes with the ASYNC option:

-- CREATE INDEXES --
create index idx_ph3_i1 on ph3_agg1(i1) ASYNC;
create index idx_ph3_d1 on ph3_agg1(d1) ASYNC;
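
To watch the ASYNC build progress, the index state can be read from SYSTEM.CATALOG (column names assumed from the Phoenix 5.x system tables):

```sql
-- INDEX_STATE should move from BUILDING to ACTIVE once the IndexTool
-- MR job finishes; schema details assumed from Phoenix 5.x.
SELECT TABLE_NAME, DATA_TABLE_NAME, INDEX_STATE
FROM SYSTEM.CATALOG
WHERE INDEX_STATE IS NOT NULL;
```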

-- Load DATA --

HADOOP_CLASSPATH=/home/hdp/3.0.0.0-1634/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /home/hdp/3.0.0.0-1634/phoenix/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -g --table ph3_agg1 --input /tmp/load/testcsvfile/test11.csv

No problem, everything seems to work.

-- BUILD INDEXES --

HADOOP_CLASSPATH=/home/hdp/3.0.0.0-1634/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /home/hdp/3.0.0.0-1634/phoenix/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.index.IndexTool --data-table PH4_AGG1 --index-table IDX_PH4_D1 --output-path IDX_PH4_D1_files

Then the same command built the i1 index. All worked OK!

-- Insert more DATA --

HADOOP_CLASSPATH=/home/hdp/3.0.0.0-1634/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /home/hdp/3.0.0.0-1634/phoenix/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -g --table ph4_agg1 --input /tmp/load/testcsvfile/test12.csv

The indexes are not updated. If I run IndexTool again, it creates another copy of the index on top of the existing one:

19.4 K 38.8 K /apps/hbase/data/data/default/IDX_PH4_D1
18.8 K 37.6 K /apps/hbase/data/data/default/IDX_PH4_I1
37.1 K 74.3 K /apps/hbase/data/data/default/PH4_AGG1

Is there an update option for IndexTool? I know you're supposed to run IndexTool from cron, but whenever I run it, the indexes become twice the size of the previous indexes.

-- Tried partial-rebuild, but it said there was nothing to do --

HADOOP_CLASSPATH=/home/hdp/3.0.0.0-1634/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /home/hdp/3.0.0.0-1634/phoenix/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.index.IndexTool --data-table PH4_AGG1 --partial-rebuild

So my question is: if my indexes are ASYNC, how do I update them as I load more data into the table?
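
One workaround I am considering, if re-running IndexTool keeps duplicating data, is to mark the index for rebuild from SQL instead (assuming ALTER INDEX ... REBUILD behaves as described in the grammar docs):

```sql
-- Rebuilds the index from the data table; an ASYNC variant exists in
-- newer Phoenix releases. Behavior assumed from the grammar docs, not
-- verified on this cluster.
ALTER INDEX IDX_PH4_D1 ON PH4_AGG1 REBUILD;
```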