Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

AtScale Concurrency

avatar

Outside of YARN queues, more node managers and HS2, is there a rule of thumb for scaling AtScale with more concurrent users? Does the Hybrid Query Service and Cache Manager have any scaling limits?

1 ACCEPTED SOLUTION

avatar

Hi @ccasano

Current limits are more tightly related to query performance because queries that take a long time can keep open threads which only serve to backup other users. So getting queries to Adaptive Cache and improving query performance is important.

That being said, there will be some clients that will have enterprise grade concurrency requirements. When this is the case, the recommendation is to stand up additional AtScale nodes which will synchronize and have a load balancer in front. Of course this also assumes that the increased concurrency demand can be serviced by the HDP cluster leveraging the Adaptive Cache.

Hope this helps.

View solution in original post

3 REPLIES 3

avatar

Hi @ccasano

Current limits are more tightly related to query performance because queries that take a long time can keep open threads which only serve to backup other users. So getting queries to Adaptive Cache and improving query performance is important.

That being said, there will be some clients that will have enterprise grade concurrency requirements. When this is the case, the recommendation is to stand up additional AtScale nodes which will synchronize and have a load balancer in front. Of course this also assumes that the increased concurrency demand can be serviced by the HDP cluster leveraging the Adaptive Cache.

Hope this helps.

avatar

@ccasano I'd also like to add to my comment that default LLAP leverages the LRFU algorithm with a pre-emption strategy on the Frequently Used side. This means that LLAP will always preempt long running queries in favor of short, adhoc queries yet it still will allow for the occasional "bring-back-all-the-data" scenario without flushing adhoc query cache. This allows for better query concurrency and provides optimal performance for the majority of BI workloads.

In addition, LLAP does not use YARN containers which would limit concurrency since each container is a user session, aka job. Most BI use cases involve many users running ad hoc queries and then keeping their session open as they look at the reports. By leverages TEZ AM for queries LLAP gets much higher concurrency.

The combination of LLAP's LRFU algorithm, use of TEZ AM, caching, and AtScale's Adaptive Cache, users get a nice boost in performance and concurrency out-of-the-box.

avatar
@Scott Shaw

Thanks Scott. This helps for now in that there are other factors we have to include when sizing / estimating for concurrency.