We've set 'phoenix.stats.updateFrequency' to 10 minutes (and verified that this property exists in hbase-site.xml and restarted HBase), but the stats are still not being updated automatically.
select * from SYSTEM.stats where "PHYSICAL_NAME"='TABLE_NAME' and "GUIDE_POSTS_WIDTH" is null;
LAST_STATS_UPDATE_TIME : 2018-06-06 20:30:30
Current time: 2018-07-07 13:20:26
But if I execute the UPDATE STATISTICS command manually, the stats are updated and LAST_STATS_UPDATE_TIME is also set to the latest time.
The problem this causes: whenever someone queries a Phoenix table, the first query takes very long (sometimes more than 15 seconds), and subsequent queries are fast for roughly the next 15 minutes. We checked the server logs for the long response times and found that the TableStatsCache expires and the stats cache is then reloaded in the same thread, which makes that query slow. We think that if the stats were updated regularly, queries would be faster because they would not have to update the stats cache.
Please check and let us know if something more is required for automatic stats updates.
I checked the region server logs and couldn't find any errors.
But I did find one difference when the stats update is triggered manually.
Logs when I trigger it manually with the command 'UPDATE STATISTICS...':
2018-06-08 16:30:13,389 INFO [RpcServer.FifoWFPBQ.default.handler=28,queue=1,port=16020-SendThread()] zookeeper.ClientCnxn: Session establishment complete on server , sessionid = 12345, negotiated timeout = 60000
2018-06-08 16:30:23,075 INFO [phoenix-update-statistics-3] coprocessor.UngroupedAggregateRegionObserver: UPDATE STATISTICS finished successfully for scanner: org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl@27ab2aa8....
2018-06-08 16:30:23,076 INFO [phoenix-update-statistics-3] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=12345
2018-06-08 16:30:23,078 INFO [phoenix-update-statistics-3] zookeeper.ZooKeeper: Session: 12345 closed
(There are two more similar threads in the logs in same chain)
Logs for other instances of the stats update (as per the thread ID, these are triggered by compactions):
2018-06-08 16:39:52,308 INFO [regionserver-shortCompactions-22-SendThread()] zookeeper.ClientCnxn: Session establishment complete on server , sessionid = 12346
2018-06-08 16:39:52,538 INFO [phoenix-update-statistics-2] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=12346
2018-06-08 16:39:52,540 INFO [phoenix-update-statistics-2] zookeeper.ZooKeeper: Session: 12346 closed.
(There is one more similar thread in same chain).
For these other instances, the number of threads, the thread ID, and the message are all different.
So it seems that the automatic stats update is not being triggered?
"UPDATE STATISTICS" will not run at the frequency specified by "phoenix.stats.updateFrequency"; this property only ensures that the local client cache is refreshed with the stats from SYSTEM.STATS at this frequency.
So you still need to run "UPDATE STATISTICS" or a compaction to have your stats updated.
Sorry, I know that the Apache documentation for this property is a little confusing. I'll update it soon for more clarity.
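For illustration, here is a minimal sketch of running "UPDATE STATISTICS" on a schedule from a client JVM over JDBC, since the property does not do that for you. The class name StatsRefresher and its method names are hypothetical, and the sketch assumes the Phoenix client jar is on the classpath so that a jdbc:phoenix URL resolves. Treat it as a starting point under those assumptions, not a recommended setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs UPDATE STATISTICS for one table at a fixed delay. */
final class StatsRefresher {

    /** Builds the statement; only plain TABLE or SCHEMA.TABLE identifiers are accepted. */
    static String updateStatsSql(String tableName) {
        if (!tableName.matches("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?")) {
            throw new IllegalArgumentException("Unexpected table name: " + tableName);
        }
        return "UPDATE STATISTICS " + tableName;
    }

    /** Opens a connection, runs UPDATE STATISTICS once, then closes the resources. */
    static void refreshOnce(String jdbcUrl, String tableName) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.execute(updateStatsSql(tableName));
        }
    }

    /** Schedules refreshOnce; the first run starts immediately, later runs follow the delay. */
    static ScheduledExecutorService schedule(String jdbcUrl, String tableName, long delayMinutes) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleWithFixedDelay(() -> {
            try {
                refreshOnce(jdbcUrl, tableName);
            } catch (Exception e) {
                // Caught here because an uncaught exception would cancel all future runs.
                System.err.println("UPDATE STATISTICS failed: " + e.getMessage());
            }
        }, 0, delayMinutes, TimeUnit.MINUTES);
        return pool;
    }
}
```

A call such as StatsRefresher.schedule("jdbc:phoenix:zk-host:2181", "MY_TABLE", 60) would start the job (the ZooKeeper quorum, table name, and 60-minute delay are placeholders), and calling shutdown() on the returned executor stops it. This only keeps SYSTEM.STATS current; as explained above, each client still refreshes its own cache from there.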
But the client cache is also not being refreshed automatically. As mentioned in my original question, the refresh happens only when a client queries some data; that particular query takes a long time to execute (as it is refreshing the cache as well), and after that all queries work fine for around 15 minutes.
Is there any way to refresh the cache automatically, so that it is not refreshed in the client's query thread?
Is there any way to update the client cache automatically? Or what would be the side effects of setting 'UPDATE_CACHE_FREQUENCY' to 'NEVER' or a very high value?
As per the Phoenix documentation on UPDATE_CACHE_FREQUENCY: "A millisecond value indicates how long the client will hold on to its cached version of the metadata before checking back with the server for updates" http://phoenix.apache.org/language/index.html#options
So I set the value with the following command (and restarted the client application as well):
Alter table <TABLE_NAME> set UPDATE_CACHE_FREQUENCY=300000;
But it does not seem to be working (the client is not updating its cache): I sent a request to the client app (for this table's data) after 15 minutes and got the same 'Cache Expired' message, whereas the cache should have been updated automatically every 5 minutes, and not in the client query thread.
Please let me know if my understanding is incorrect or something is missing.