Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Expert Contributor

Hello I am using HDP 2.5.3 and Phoenix 4.7 and it seems to me flashback / snapshot query would not work if the latest index / indexed data changes.

Below is what I have done and experienced:

-- creating the schema and table
CREATE SCHEMA IF NOT EXISTS TRADE;
DROP TABLE IF EXISTS TRADE.TRADE;
CREATE TABLE IF NOT EXISTS TRADE.TRADE (ID VARCHAR, BROKER VARCHAR CONSTRAINT pk PRIMARY KEY (ID));
CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);

-- insert some date
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('123', 'BROKER123');
1 row affected (0.085 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER456');
1 row affected (0.036 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
|  ID  |   BROKER   |
+------+------------+
| 456  | BROKER456  |
+------+------------+
1 row selected (0.107 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
|                                          PLAN                                          |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456']  |
|     SERVER FILTER BY FIRST KEY ONLY                                                    |
+----------------------------------------------------------------------------------------+
2 rows selected (0.054 seconds)
-- logged current time
0: jdbc:phoenix:>  select cast(CURRENT_TIME() as BIGINT) from TRADE.TRADE limit 1;
+--------------------------------------------+
| TO_BIGINT(TIME '2017-03-28 14:50:38.678')  |
+--------------------------------------------+
| 1490706922000                              |
+--------------------------------------------+

0: jdbc:phoenix:sandbox:2181/h> !quit

[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- showing what the snapshot data should look like
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
|  ID  |   BROKER   |
+------+------------+
| 456  | BROKER456  |
+------+------------+
1 row selected (0.197 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- showing what the select is using correct index
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
|                                          PLAN                                          |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456']  |
|     SERVER FILTER BY FIRST KEY ONLY                                                    |
+----------------------------------------------------------------------------------------+
2 rows selected (0.044 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit



[root]$ sqlline.py 
0: jdbc:phoenix:> -- now make changes to the index and indexed data
0: jdbc:phoenix:> DROP INDEX TRADE_BROKER_IDX ON TRADE.TRADE;
2 rows affected (0.748 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', '456BROKER');
1 row affected (0.1 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = '456BROKER';
+------+------------+
|  ID  |   BROKER   |
+------+------------+
| 456  | 456BROKER  |
+------+------------+
1 row selected (0.123 seconds)
0: jdbc:phoenix:master.sandbox.lbg.com:2181/h> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.212 seconds)

0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
|                                 PLAN                                  |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE  |
|     SERVER FILTER BY BROKER = 'BROKER456'                             |
+-----------------------------------------------------------------------+
2 rows selected (0.042 seconds)
0: jdbc:phoenix:> !quit



[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- I would expect snaphot to use index, but in this case it is using
0: jdbc:phoenix:sandbox:2181/h> -- latest index state - i.e. no index and full scan
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
|  ID  |   BROKER   |
+------+------------+
| 456  | BROKER456  |
+------+------------+
1 row selected (0.168 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
|                                 PLAN                                  |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE  |
|     SERVER FILTER BY BROKER = 'BROKER456'                             |
+-----------------------------------------------------------------------+
2 rows selected (0.035 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit



[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- now recreate index and update indexed data yet again
0: jdbc:phoenix:> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.23 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER-456');
1 row affected (0.021 seconds)
0: jdbc:phoenix:> -- at latest state correctly retrieved data and use index in explain
0: jdbc:phoenix:> SELECT * FROM TRADE.TRADE;
+------+-------------+
|  ID  |   BROKER    |
+------+-------------+
| 456  | BROKER-456  |
| 123  | BROKER123   |
+------+-------------+
2 rows selected (0.14 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER-456';
+-----------------------------------------------------------------------------------------+
|                                          PLAN                                           |
+-----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [2,'BROKER-456']  |
|     SERVER FILTER BY FIRST KEY ONLY                                                     |
+-----------------------------------------------------------------------------------------+
2 rows selected (0.047 seconds)
0: jdbc:phoenix:> !quit



[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- UNEXPECTEDLY NOTHING IS RETURNED!
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE ;
+-----+---------+
| ID  | BROKER  |
+-----+---------+
+-----+---------+
No rows selected (0.066 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT * FROM TRADE.TRADE ;
+----------------------------------------------------------------------------+
|                                    PLAN                                    |
+----------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1]  |
|     SERVER FILTER BY FIRST KEY ONLY                                        |
+----------------------------------------------------------------------------+
2 rows selected (0.031 seconds)
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----+---------+
| ID  | BROKER  |
+-----+---------+
+-----+---------+
No rows selected (0.077 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- AND IT WORKS IF FORCING PHOENIX TO DO FULL SCAN
0: jdbc:phoenix:sandbox:2181/h> SELECT /*+NO_INDEX*/ * FROM TRADE.TRADE ;
+------+------------+
|  ID  |   BROKER   |
+------+------------+
| 123  | BROKER123  |
| 456  | BROKER456  |
+------+------------+
2 rows selected (0.07 seconds)


Anyone had the same issue? How did you resolve it or is this a phoenix bug? If this is a bug is there plan to fix this in which HDP / Phoenix version?

Thank you!

6 REPLIES 6

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

  1. 0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT * FROM TRADE.TRADE ;
  2. +----------------------------------------------------------------------------+
  3. | PLAN |
  4. +----------------------------------------------------------------------------+
  5. | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1]|
  6. | SERVER FILTER BY FIRST KEY ONLY |

Here it's going through deleted local index that means the data table PTable object is not updated after dropping the local index which is odd. When we drop the index we delete data from the local indexes but somehow the we are getting dropped local index from the meta data.

The reason might be like when we drop the index dropped meta entries have very latest timestamp than 1490706922000. But when we set currentSCN=1490706922000 the meta data might be still visible.

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Expert Contributor

umm ok but isnt the whole point of flashback snapshot query is that the query goes back to that moment in time and any subsequent or current update to hbase will not be reflected in the snapshot query?

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Exactly they won't be reflected.

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Expert Contributor

@Rajeshbabu Chintaguntla right sorry I actually don't understand your first answer. From what I have seen once the index is changed the flash back query no longer works unless I force it to use no index.

So do you mean there is a bug in phoenix or if I waited sufficient time I will see the flashback query works again?

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Expert Contributor

Ok I suppose this article https://community.hortonworks.com/articles/106089/dropping-a-local-index-breaks-scn-query-in-phoenix... kind of suggest it will be fixed in 2.6...

Highlighted

Re: Phoenix flashback / snapshot (currentSCN) query not working when secondary index / indexed data changes

Expert Contributor

So 2 years on I am now using hdp 2.6.4, but is this problem fixed or still an issue?