
Impala, Llama, and YARN on CDH 5.5


New Contributor

Hi All,

 

In the new CDH 5.5 documentation, it is stated that:

 

Note: Although Impala can be used together with YARN via simple configuration of Static Service Pools in Cloudera Manager, the use of the general-purpose component Llama for integrated resource management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.

 

Does this mean that using Llama with YARN is no longer recommended, and that Impala should manage its resources outside of YARN?

Thanks.


Re: Impala, Llama, and YARN on CDH 5.5

Explorer

I have the same question. I installed Impala 2.3 and Llama 1.0 (CDH 5.5) on an existing cluster (CDH 5.4). If I set the enable_rm flag to true, queries stay pending and then time out, but they run fine without enable_rm.

Can anyone explain this?

Re: Impala, Llama, and YARN on CDH 5.5

Master Collaborator

We don't recommend that you use the Impala/Llama integration at this stage. Impala/Llama dynamically acquires resources from YARN for each query that runs, but we've found that its stability and predictability aren't where they need to be.

 

The recommendation is to give Impala a static set of resources that it can allocate to queries internally. Those resources can be YARN-managed resources, or resources outside of YARN.

 

The simplest option is to run Impala independently of YARN, by allocating it a fixed amount of resources.

 

If you want to integrate with YARN without using the Impala/Llama integration, you can use Static Service Pools: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_mc_service_pools...
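As a rough illustration of what a static split means, the sketch below divides a host's memory among services by fixed percentage shares, which is roughly how Static Service Pools are set up in Cloudera Manager. The function name, the service names, and the numbers are all made up for this example; this is not Cloudera Manager code or an API it exposes.

```python
def split_static_pools(total_mb, percentages):
    """Divide total_mb among services by fixed percentage shares.

    Hypothetical helper for illustration only; not a Cloudera Manager API.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")
    # Integer division: each service gets a fixed slice that does not
    # change with load, which is the point of a static pool.
    return {name: total_mb * pct // 100 for name, pct in percentages.items()}


# Example: a 64 GiB host split between three services (made-up numbers).
pools = split_static_pools(65536, {"impala": 50, "yarn": 30, "hdfs": 20})
print(pools)
```

Under such a split, Impala's share stays the same no matter how busy YARN is, and Impala decides internally how to divide that share among its queries.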

Re: Impala, Llama, and YARN on CDH 5.5

Contributor

Can you please confirm/deny the following conclusions I have made by reading the new 5.5 docs, your above post and testing "Static Service Pools" in a test environment:


1. Impala's integration with YARN through Llama is being discontinued from CDH 5.5/Impala 2.3 onwards?


2. Configuring "Static Service Pools" means that both Impala and YARN (and other services too) are managed by cgroups rather than Impala being managed by YARN?


If point 2 is true, it basically means that Impala can no longer be managed by YARN from CDH 5.5 onwards. I tried enabling "Static Service Pools", but the Impala queries I ran do not show up in YARN Applications.

 

Re: Impala, Llama, and YARN on CDH 5.5

Cloudera Employee

ccahadoop,

 

Indeed, this is correct. You will not see the queries as YARN applications. You can use Admission Control within Impala for resource management within the Impala Static Service Pool.
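To give a feel for the general idea of admission control, here is a toy model of a pool that admits queries up to a concurrency limit, queues a few more, and rejects the rest. It is only a sketch: the class, its parameters, and the query names are invented for illustration, and Impala's actual Admission Control has its own configuration and takes more into account (memory limits, for instance).

```python
from collections import deque


class ToyAdmissionPool:
    """Toy model of an admission-controlled pool (illustration only)."""

    def __init__(self, max_running, max_queued):
        self.max_running = max_running
        self.max_queued = max_queued
        self.running = set()
        self.queue = deque()

    def submit(self, query_id):
        """Return 'admitted', 'queued', or 'rejected' for a new query."""
        if len(self.running) < self.max_running:
            self.running.add(query_id)
            return "admitted"
        if len(self.queue) < self.max_queued:
            self.queue.append(query_id)
            return "queued"
        return "rejected"

    def finish(self, query_id):
        """Mark a running query done and admit the next queued one, if any."""
        self.running.discard(query_id)
        if self.queue and len(self.running) < self.max_running:
            self.running.add(self.queue.popleft())


# Hypothetical workload: four queries against a pool of two slots.
pool = ToyAdmissionPool(max_running=2, max_queued=1)
results = [pool.submit(q) for q in ("q1", "q2", "q3", "q4")]
pool.finish("q1")
```

The point is that the limits are enforced inside the service itself, against a fixed budget, rather than by asking an external scheduler such as YARN for resources on every query.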

 

-Matt

Re: Impala, Llama, and YARN on CDH 5.5

Explorer

mattschumpert

 

According to a recently published article, "New SQL Benchmarks: Apache Impala (incubating) Uniquely Delivers Analytic Database Performance", Cloudera's roadmap for 2016 suggests improved YARN integration. Does that mean Cloudera is going to continue supporting Llama + YARN?

Re: Impala, Llama, and YARN on CDH 5.5

New Contributor
Did you set rm_always_use_default = true? I set enable_rm = true and rm_always_use_default = false, and I do not encounter your problem.