In the new CDH 5.5 documentation, it is stated that:
Note: Although Impala can be used together with YARN via simple configuration of Static Service Pools in Cloudera Manager, the use of the general-purpose component Llama for integrated resource management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.
Does this mean that using Llama with YARN is no longer recommended, and that Impala should manage its resources outside of YARN?
I have the same question. I installed Impala 2.3 and Llama 1.0 (CDH 5.5) on an existing CDH 5.4 cluster. If I set the enable_rm flag to true, queries stay pending and eventually time out; without enable_rm they run fine.
Can anyone explain this?
We don't recommend that you use the Impala/Llama integration at this stage. Impala/Llama dynamically acquires resources from YARN for each query that runs, but we've found that the stability and predictability aren't where they need to be.
The recommendation is to give Impala a static set of resources that it can allocate to queries internally. Those resources can be YARN-managed resources, or resources outside of YARN.
The simplest option is to run Impala independently of YARN, by allocating it a fixed amount of resources.
If you want to integrate with YARN without using the Impala/Llama integration, you can use Static Service Pools: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_mc_service_pools...
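To make the idea of a fixed allocation concrete, here is a minimal Python sketch of splitting one node's memory between services by percentage. The function name, the service names, and the 50/30/20 split are hypothetical, and the sketch assumes the percentages are meant to add up to 100; it only illustrates the arithmetic, not how Cloudera Manager applies the limits.

```python
def split_node_memory(node_mem_gb, percentages):
    """Split one node's memory among services by percentage.

    Hypothetical helper: the service names and percentages are
    illustrative, and are assumed to sum to 100.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("service percentages must sum to 100")
    return {svc: node_mem_gb * pct / 100 for svc, pct in percentages.items()}


# Example: a 128 GB node shared by three services (made-up split).
shares = split_node_memory(128, {"impala": 50, "yarn": 30, "hdfs": 20})
print(shares)
```

With a split like this, Impala's share stays the same no matter how busy YARN is, which is the predictability the static approach is after.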
Can you please confirm/deny the following conclusions I have made by reading the new 5.5 docs, your above post and testing "Static Service Pools" in a test environment:
1. Impala's integration with YARN through Llama is being discontinued from CDH 5.5/Impala 2.3 onwards?
2. Configuring "Static Service Pools" means that both Impala and YARN (and other services too) are managed by cgroups rather than Impala being managed by YARN?
If point 2 is true, it basically means that Impala can no longer be managed by YARN from CDH 5.5 onwards. I tried enabling "Static Service Pools", but I do not see the Impala queries I ran listed under YARN Applications.
Indeed, this is correct. You will not see the queries as YARN applications. You can use Admission Control within Impala for resource management inside the Impala Static Service Pool.
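As a rough illustration of what admission control does inside a fixed pool, here is a toy Python model. It is not Impala's implementation: the class, its parameters, and the FIFO policy are assumptions made for the sketch. A query is admitted while the pool is under its concurrency and memory caps, queued while the queue has room, and rejected otherwise.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Toy model of an admission-controlled pool; all names are hypothetical."""
    max_running: int      # cap on concurrently running queries
    max_queued: int       # cap on queries waiting for admission
    mem_limit_mb: int     # total memory the pool may hand out
    running: dict = field(default_factory=dict)   # query id -> reserved MB
    queue: deque = field(default_factory=deque)   # (query id, MB) waiting

    def _fits(self, mem_mb):
        used = sum(self.running.values())
        return (len(self.running) < self.max_running
                and used + mem_mb <= self.mem_limit_mb)

    def submit(self, qid, mem_mb):
        """Return 'admitted', 'queued', or 'rejected' for a new query."""
        # Only bypass the queue when nobody is already waiting (FIFO).
        if not self.queue and self._fits(mem_mb):
            self.running[qid] = mem_mb
            return "admitted"
        if len(self.queue) < self.max_queued:
            self.queue.append((qid, mem_mb))
            return "queued"
        return "rejected"

    def finish(self, qid):
        """Release a query's memory and admit queued queries in FIFO order."""
        self.running.pop(qid)
        admitted = []
        while self.queue and self._fits(self.queue[0][1]):
            next_id, mem_mb = self.queue.popleft()
            self.running[next_id] = mem_mb
            admitted.append(next_id)
        return admitted


pool = ToyPool(max_running=2, max_queued=1, mem_limit_mb=1000)
results = [pool.submit("q1", 400), pool.submit("q2", 400),
           pool.submit("q3", 400), pool.submit("q4", 100)]
print(results)
admitted_after = pool.finish("q1")
print(admitted_after)
```

The point of the model is that all decisions are made against limits Impala already owns, so nothing has to be requested from YARN per query.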
According to a recently published article, "New SQL Benchmarks: Apache Impala (incubating) Uniquely Delivers Analytic Database Performance", Cloudera's roadmap for 2016 suggests improved YARN integration. Does that mean Cloudera is going to continue supporting Llama + YARN?