
How do I use LLAP SSD cache?

I am testing the LLAP cache on SSD. I set hive.llap.io.allocator.mmap to true and hive.llap.io.allocator.mmap.path to the SSD mount directory '/data1'. Each LLAP daemon has about 59 GB of memory, with Xmx at about 57 GB and the cache at about 48 GB.
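For reference, the two settings above can be written out as a hive-site.xml style fragment. The following is only a minimal sketch: the property names and values are the ones quoted in this post, while the helper name `to_hive_site_fragment` and the dict name are made up for illustration. Where the fragment has to be placed depends on how the cluster is managed.

```python
from xml.sax.saxutils import escape

# Values quoted from the post; the dict name is illustrative only.
LLAP_SSD_SETTINGS = {
    "hive.llap.io.allocator.mmap": "true",
    "hive.llap.io.allocator.mmap.path": "/data1",
}


def to_hive_site_fragment(settings):
    """Render a dict of settings as Hadoop-style <property> elements."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


fragment = to_hive_site_fragment(LLAP_SSD_SETTINGS)
print(fragment)
```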

LLAP then started successfully and the local directory '/data1/llap-6454532602074638940' was created, as shown below:

109917-1563261600030.png

I ran a TPC-DS test with 200 GB of text data and watched the cache metrics through port 15002 as well as Grafana, as shown below:

109922-1563258870369.png


109916-offheap.jpg
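For anyone who wants the same numbers without the screenshots, here is a rough sketch of pulling the cache metrics from the daemon on port 15002. It assumes the daemon serves the usual Hadoop-style `/jmx` JSON document with a top-level `beans` list, and that the cache attributes have names starting with `Cache`; both are assumptions on my side and not confirmed in this post. The function names are made up.

```python
import json
from urllib.request import urlopen


def fetch_jmx(host, port=15002, timeout=10):
    """Fetch the JSON document from the daemon's /jmx servlet (assumed endpoint)."""
    with urlopen(f"http://{host}:{port}/jmx", timeout=timeout) as resp:
        return json.load(resp)


def cache_metrics(jmx_doc):
    """Collect numeric attributes whose name starts with 'Cache' from every bean."""
    found = {}
    for bean in jmx_doc.get("beans", []):
        for key, value in bean.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if key.startswith("Cache") and is_number:
                found[key] = value
    return found
```

Usage would be something like `cache_metrics(fetch_jmx("llap-host"))`, where `llap-host` is a placeholder for the node that runs the daemon.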


We can see that about 41 GB of cache was populated on a single LLAP daemon after several TPC-DS queries. However, I found no data in the local SSD cache directory '/data1/llap-6454532602074638940'; the directory was always empty.
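One way to double-check the "empty directory" observation is to compare what is visible in the directory with what the filesystem itself reports as used. On POSIX systems a file that has been unlinked while a process still has it open or mapped does not show up in a listing, but its blocks still count as used space. Whether LLAP handles its cache files that way is not established here, so this is only a diagnostic sketch with made-up function names.

```python
import os
import shutil


def directory_bytes(path):
    """Sum the sizes of the files that are visible under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file disappeared between listing and stat; skip it.
                pass
    return total


def filesystem_used_bytes(path):
    """Bytes in use on the filesystem that holds path."""
    return shutil.disk_usage(path).used
```

Running `directory_bytes('/data1/llap-6454532602074638940')` and `filesystem_used_bytes('/data1')` before and after a few queries would show whether space on the SSD grows even though the listing stays empty.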

I also observed the memory usage of the physical machine while LLAP was running, as shown below:

109923-1563259999303.png


And the memory usage of the physical machine with LLAP turned off, as shown below:

109915-1563260189167.png


From these two pictures, it seems that a single LLAP daemon holds about 80 GB of memory (only 59 GB of LLAP daemon memory was actually configured), plus about 20 GB of operating system cache.
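To make the gap explicit, the arithmetic on the rough numbers above looks like this. It is only a sketch on the approximate figures read off the screenshots; the variable names are made up.

```python
# Approximate figures from the post, in GB.
configured_daemon_gb = 59   # LLAP daemon memory as configured
observed_daemon_gb = 80     # memory the daemon appears to hold
observed_os_cache_gb = 20   # operating system cache while LLAP runs


def unexplained_gb(observed, configured):
    """Memory held beyond what was configured."""
    return observed - configured


gap = unexplained_gb(observed_daemon_gb, configured_daemon_gb)
print(f"held beyond configuration: {gap} GB")        # 21 GB
print(f"reported OS cache: {observed_os_cache_gb} GB")
```

The 21 GB difference is close in size to the 20 GB of operating system cache, but that is just an observation from these numbers and not an explanation.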


So here are my questions:

1) How does LLAP use SSD caching? Why is there no data in the local SSD mount directory?

2) Why does LLAP with the SSD cache use more memory than was configured? Is the data still being cached in memory instead of on the SSD?

3) How do I use the LLAP SSD cache correctly, and how do I monitor LLAP cache usage?


Thanks very much!