How do we control the number of cores used by Impala? We currently run YARN and Impala workloads on the same cluster, but Impala queries are CPU-intensive and end up using most of the available CPU cores on the system.
There are different methods. To see the default settings,
go to CM -> Hosts -> select a host -> Resources (menu).
It shows how many resources (CPU, memory, etc.) are to be allocated to YARN, Impala, HDFS, etc. on that node.
You can control them as follows:
a. CM -> YARN -> Config -> click "NodeManager" (left) and "Resource Management" (left) -> adjust the CPU or memory settings as needed.
b. CM -> Impala -> Config -> click "Impala Daemon" (left) and "Resource Management" (left) -> adjust only cpu.shares and mem_limit as needed.
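
To make it clearer what cpu.shares does and does not do, here is a small Python sketch. It assumes the Linux cgroup v1 behaviour, where cpu.shares is a relative weight that only matters when the CPU is contended: each group gets roughly its shares divided by the sum of all competing shares. The share values, the 16-core host, and the function names are hypothetical, chosen only for illustration; they are not CM defaults.

```python
def cpu_fraction_under_contention(shares):
    """Return the approximate CPU fraction each service gets when every
    service is busy at the same time (cgroup v1 cpu.shares semantics)."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one service needs a positive share value")
    return {name: value / total for name, value in shares.items()}


def cores_under_contention(shares, host_cores):
    """Translate the fractions into an approximate core count for one host."""
    fractions = cpu_fraction_under_contention(shares)
    return {name: fraction * host_cores for name, fraction in fractions.items()}


# Hypothetical per-host values, not taken from any real cluster.
example_shares = {"yarn": 1536, "impala": 512}
example_cores = cores_under_contention(example_shares, host_cores=16)

for service, cores in example_cores.items():
    print(f"{service}: about {cores:.1f} cores when the host is fully busy")
```

Note that this is a weight, not a hard cap: if YARN is idle on a host, Impala can still use all the cores there. The split only applies while both services compete for CPU.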
If that does not help, you can use dynamic resource pools and create separate job queues for MR and Impala.
Resources assigned under the Resources tab apply per host. We need to limit CPU at the Impala cluster level.
Also, we are using dynamic pools, not static pools, so I believe cpu.shares will not be helpful.