Optimize LLAP Cache & Realtime Stats

Hello,

Is it possible to have a little more control over what goes into the LLAP cache, especially when new data is loaded to HDFS and LLAP is queried for the first time? Our use case expects the most recently added data to be in the cache (right now, hive.llap.io.use.lrfu is set to true).

However, we want a combination of both: newly added data in the cache, together with LRFU eviction.

Will @Marcos Da Silva's suggestion work for data that is loaded every few hours, or is there a better, more general approach?

select column1, column2 from table where partition_column in
(select max(distinct partition_column) from table)
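To make the idea concrete, here is a minimal sketch of how that warm-up query could be generated and run after each load. It only builds the statement and a beeline command line. The function names, the example table and column names, and the JDBC URL are hypothetical, and whether running the query actually keeps the newest partition in the LLAP cache is an assumption that depends on cache size and eviction.

```python
import re

# Allow plain identifiers, optionally qualified as db.table (illustrative check only).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_warmup_query(table, partition_column, columns):
    """Build a HiveQL query that reads the given columns from the newest partition.

    Hypothetical helper: it mirrors the query quoted above, nothing more.
    """
    for name in [table, partition_column, *columns]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    col_list = ", ".join(columns)
    return (
        f"select {col_list} from {table} "
        f"where {partition_column} in "
        f"(select max(distinct {partition_column}) from {table})"
    )


def build_beeline_command(jdbc_url, query):
    """Return an argument list that runs the query through beeline (-u URL, -e query)."""
    return ["beeline", "-u", jdbc_url, "-e", query]


if __name__ == "__main__":
    # Assumed names; replace with the real table, columns, and HiveServer2 Interactive URL.
    query = build_warmup_query("events", "load_date", ["column1", "column2"])
    print(" ".join(build_beeline_command("jdbc:hive2://llap-host:10500/default", query)))
```

The resulting command could be scheduled right after each load job finishes (every few hours in our case), so the first user query does not pay for the cold read.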

Also, I don't see Grafana in my Ambari. Do I have to install it on all the nodes of the cluster to see real-time stats for LLAP?

Reference URL : https://community.hortonworks.com/questions/85330/how-to-optimize-hive-access-to-the-latest-partitio...
