Created on 08-09-2018 03:22 PM - edited 09-16-2022 06:34 AM
How do we control the number of cores used by Impala? We currently run both YARN and Impala workloads, but the Impala queries are CPU intensive and end up using most of the available CPU cores on the system.
Created 08-09-2018 03:35 PM
We have also set --num-scanner-threads to 52, assuming 1 thread per core. We have a 28-core CPU with hyper-threading enabled.
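For reference, the arithmetic behind that setting can be sketched as follows. This is only a rough illustration, assuming the 28 cores are the host's total physical cores and that hyper-threading exposes 2 hardware threads per core. The function names are made up for this example and are not part of Impala.

```python
def logical_cpus(physical_cores, threads_per_core=2):
    """Logical CPUs the OS sees (assumes 2 hardware threads per core)."""
    return physical_cores * threads_per_core


def unassigned_logical_cpus(physical_cores, scanner_threads, threads_per_core=2):
    """Logical CPUs left over if each scanner thread occupied one logical CPU."""
    return logical_cpus(physical_cores, threads_per_core) - scanner_threads


print(logical_cpus(28))                 # 56
print(unassigned_logical_cpus(28, 52))  # 4
```

Under these assumptions, 52 scanner threads leave about 4 logical CPUs for everything else on the host. As far as I know the scanner thread limit applies per query on each node, so several concurrent queries could together still use more threads than this.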
Created 08-10-2018 04:51 AM
There are different methods. Below are the default settings.
If you go to CM -> Hosts -> select a host -> Resources (menu),
it will show you how many resources (CPU, memory, etc.) are to be allocated to YARN, Impala, HDFS, etc. per node.
You can control them as follows:
a. CM -> YARN -> Config -> click on "NodeManager" (left) and "Resource Management" (left) -> adjust the CPU or memory settings as needed
b. CM -> Impala -> Config -> click on "Impala Daemon" (left) and "Resource Management" (left) -> consider only cpu.shares and mem_limit as needed.
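To make the effect of cpu.shares concrete: in Linux cgroups it is a relative weight, not a hard cap, so it only matters when the CPUs are actually contended. A minimal sketch of the proportional split is below. The share values and the function name are made up for illustration and are not taken from any real cluster.

```python
def cpu_fraction_under_contention(shares):
    """Fraction of CPU time each group gets when all groups are busy.

    `shares` maps a group name to its cpu.shares weight (hypothetical values).
    """
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


# Hypothetical weights: Impala at 1024, YARN containers at 3072.
fractions = cpu_fraction_under_contention({"impala": 1024, "yarn": 3072})
print(fractions)  # {'impala': 0.25, 'yarn': 0.75}
```

With these example weights Impala would get about a quarter of the CPU time while YARN is busy, but it could still use idle cores when YARN is not.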
If that does not help, you can use dynamic resource pools and create separate job queues for MR and Impala.
Created 08-13-2018 07:09 AM
Assigning resources under the Resources tab is per host. We need to limit CPU at the Impala cluster level.
Also, we are not using static pools; we are using dynamic pools, so I believe cpu.shares will not be helpful.