Created on 06-03-2017 11:36 PM
PROBLEM: When selecting from a relatively large table (a few thousand rows), some rows are returned with some of their column values missing. When the filter is narrowed to return only those specific rows, the values appear as expected.
STEPS TO REPRODUCE:
1. Create a table
CREATE TABLE IF NOT EXISTS TEST (
    BUCKET VARCHAR,
    TIMESTAMP_DATE TIMESTAMP,
    TIMESTAMP UNSIGNED_LONG NOT NULL,
    SRC VARCHAR,
    DST VARCHAR,
    ID VARCHAR,
    ION VARCHAR,
    IC BOOLEAN NOT NULL,
    MI UNSIGNED_LONG,
    AV UNSIGNED_LONG,
    MA UNSIGNED_LONG,
    CNT UNSIGNED_LONG,
    DUMMY VARCHAR
    CONSTRAINT pk PRIMARY KEY (BUCKET, TIMESTAMP DESC, SRC, DST, ID, ION, IC)
);
2. Use a Python script to generate a CSV with 5000 rows (redirect its output to large.csv)
for i in range(5000):
    # One CSV line per row; {i} varies the timestamp, the numeric columns and the DUMMY suffix.
    print("5SEC,2016-07-21 07:25:35.{i},146908593500{i},WWWWWWWW,AAA,BBBB,CCCCCCCC,false,{i}1181000,1788000{i},2497001{i},{i},aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{i}".format(i=i))
3. Bulk load the CSV into the table
phoenix/bin/psql.py localhost -t TEST large.csv
4. Select all rows and note that one row has no TIMESTAMP_DATE, null values in MI and MA, and an empty DUMMY
0: jdbc:phoenix:localhost:2181> select * from TEST
....
+---------+--------------------------+-------------------+-----------+------+-------+-----------+--------+--------------+--------------+--------------+-------+----------------------------------------------------------------------------+
| BUCKET  | TIMESTAMP_DATE           | TIMESTAMP         | SRC       | DST  | ID    | ION       | IC     | MI           | AV           | MA           | CNT   | DUMMY                                                                      |
+---------+--------------------------+-------------------+-----------+------+-------+-----------+--------+--------------+--------------+--------------+-------+----------------------------------------------------------------------------+
| 5SEC    | 2016-07-21 07:25:35.100  | 1469085935001000  | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 10001181000  | 17880001000  | 24970011000  | 1000  | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1000  |
| 5SEC    | 2016-07-21 07:25:35.999  | 146908593500999   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9991181000   | 1788000999   | 2497001999   | 999   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa999   |
| 5SEC    | 2016-07-21 07:25:35.998  | 146908593500998   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9981181000   | 1788000998   | 2497001998   | 998   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa998   |
| 5SEC    |                          | 146908593500997   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | null         | 1788000997   | null         | 997   |                                                                            |
| 5SEC    | 2016-07-21 07:25:35.996  | 146908593500996   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9961181000   | 1788000996   | 2497001996   | 996   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa996   |
| 5SEC    | 2016-07-21 07:25:35.995  | 146908593500995   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9951181000   | 1788000995   | 2497001995   | 995   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa995   |
| 5SEC    | 2016-07-21 07:25:35.994  | 146908593500994   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9941181000   | 1788000994   | 2497001994   | 994   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa994   |
....
5. When selecting that row specifically the values are correct
0: jdbc:phoenix:localhost:2181> select * from TEST where timestamp = 146908593500997;
+---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
| BUCKET  | TIMESTAMP_DATE           | TIMESTAMP        | SRC       | DST  | ID    | ION       | IC     | MI          | AV          | MA          | CNT  | DUMMY                                                                     |
+---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
| 5SEC    | 2016-07-21 07:25:35.997  | 146908593500997  | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9971181000  | 1788000997  | 2497001997  | 997  | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa997  |
+---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
1 row selected (0.159 seconds)
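Spotting the affected rows by eye in a few thousand lines of output is tedious. The check in steps 4 and 5 could be automated roughly as follows. This is only a minimal sketch, and it assumes the result of the full select has been exported to a CSV file with the same column order as the table; how that export is produced is left open. The function name find_partial_rows and the shortened DUMMY values in the demo data are made up for illustration. A scanned row is flagged when a column that has a value in the source CSV comes back empty or as the literal string null.

```python
import csv
import io

# Column order taken from the CREATE TABLE statement in step 1.
COLUMNS = ["BUCKET", "TIMESTAMP_DATE", "TIMESTAMP", "SRC", "DST", "ID",
           "ION", "IC", "MI", "AV", "MA", "CNT", "DUMMY"]


def find_partial_rows(source_csv, scanned_csv, key="TIMESTAMP"):
    """Return (key, missing_columns) for scanned rows that lost values.

    source_csv:  file-like object with the rows that were loaded (large.csv).
    scanned_csv: file-like object with the exported result of the full select
                 (assumed to be CSV in the same column order).
    """
    key_idx = COLUMNS.index(key)
    # Index the loaded rows by the key column so each scanned row can be matched.
    source = {row[key_idx]: row for row in csv.reader(source_csv) if row}

    partial = []
    for row in csv.reader(scanned_csv):
        if not row:
            continue
        expected = source.get(row[key_idx])
        if expected is None:
            continue  # row not present in the source file, nothing to compare
        # Only emptiness is compared, so formatting differences do not matter.
        missing = [COLUMNS[i]
                   for i, (want, got) in enumerate(zip(expected, row))
                   if want and got in ("", "null")]
        if missing:
            partial.append((row[key_idx], missing))
    return partial


# Demo with two rows modelled on the output above (DUMMY shortened for brevity).
source_demo = io.StringIO(
    "5SEC,2016-07-21 07:25:35.997,146908593500997,WWWWWWWW,AAA,BBBB,CCCCCCCC,"
    "false,9971181000,1788000997,2497001997,997,a997\n"
    "5SEC,2016-07-21 07:25:35.996,146908593500996,WWWWWWWW,AAA,BBBB,CCCCCCCC,"
    "false,9961181000,1788000996,2497001996,996,a996\n")
scan_demo = io.StringIO(
    "5SEC,,146908593500997,WWWWWWWW,AAA,BBBB,CCCCCCCC,"
    "false,null,1788000997,null,997,\n"
    "5SEC,2016-07-21 07:25:35.996,146908593500996,WWWWWWWW,AAA,BBBB,CCCCCCCC,"
    "false,9961181000,1788000996,2497001996,996,a996\n")
print(find_partial_rows(source_demo, scan_demo))
```

Rows reported this way can then be re-queried individually, as in step 5, to confirm that the stored data is intact and only the scan result is incomplete.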
SOLUTION: This is a known issue and is unresolved as of now. Please track it under PHOENIX-3112.
WORKAROUND: Try increasing the value of "hbase.client.scanner.max.result.size", which has helped in many cases. Be aware that a larger value has the side effect of increasing memory pressure.
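As an illustration only, the property could be set in the client-side hbase-site.xml along these lines. The value is given in bytes, and the number below is an arbitrary assumption rather than a recommendation; choose it based on row size and available memory.

```xml
<!-- Sketch: raise the maximum scanner result size (in bytes). -->
<!-- 8388608 (8 MB) is an assumed example value, not a tested setting. -->
<property>
  <name>hbase.client.scanner.max.result.size</name>
  <value>8388608</value>
</property>
```

After changing it, rerun the full select from step 4 to check whether the partially missing rows still appear.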