Is it possible to have a little more control over what goes into the LLAP cache, especially when new data is loaded into HDFS and LLAP is queried for the first time? Our use case expects the most recently added data to be in the cache (right now, the property set to true is hive.llap.io.use.lrfu).
But we want a combination of newly added data in the cache and LRFU.
Will @Marcos Da Silva's suggestion work for data loaded every few hours? Or is there a better, more general approach?
select column1, column2 from table where partition_column in
(select max(distinct partition_column) from table)
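Since our data arrives every few hours, I assume we would have to re-run that query after each load. Below is a minimal sketch of how the statement could be generated per table. The function name build_warmup_query is hypothetical, and it only builds the HiveQL text; submitting it to LLAP (for example through beeline) is left to whatever client is already in use. Whether reading the newest partition once is enough to keep it cached under LRFU is still the open question.

```python
def build_warmup_query(table, columns, partition_column):
    """Build a HiveQL statement that reads `columns` from the newest partition.

    Hypothetical helper: identifiers are inserted as-is, so they must come
    from trusted configuration, not from user input.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    # Same shape as the suggested query: restrict the scan to the
    # partition with the highest partition value.
    return (
        f"SELECT {column_list} FROM {table} "
        f"WHERE {partition_column} IN "
        f"(SELECT MAX({partition_column}) FROM {table})"
    )


if __name__ == "__main__":
    print(build_warmup_query("table", ["column1", "column2"], "partition_column"))
```

The idea would be to call this right after each load job finishes, listing only the columns that the real queries touch.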
Also, I don't see Grafana in my Ambari. Do I have to install it on all the nodes of the cluster to see real-time stats for LLAP?