
Unable to utilise all cores in Impala?

Hello,

 

I am checking Impala performance on multiple systems with different hardware configurations. While testing on an 8-core system, I see only 1 core reach 100% CPU while the remaining cores stay idle when executing a SELECT query. I am not able to use the entire hardware, so how can I make all cores work for Impala?

I used htop to check CPU usage.


Currently my cluster is a single node with 12 GB RAM and an 8-core CPU.

I am using the latest Cloudera VM.

 

I checked the Impala and HDFS configs but found only Cgroup CPU Shares = 1024. Is there a parameter that defines the number of cores to be used?

 

Thanks 

punshi


Re: Unable to utilise all cores in Impala ?


@punshi this is very dependent on the specific queries and the workload as a whole (i.e. concurrent queries). Some operators in Impala, mainly scans, are always parallelised, so if the query is mostly scan-intensive you should see multiple cores in use. Joins and aggregates are not parallelised within a node, so if those are the bottleneck and you are only running one query at a time, you may only see a single core utilised. On production workloads with concurrent queries we typically see CPU saturated, so production clusters usually have no issue using all cores.
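
To see the effect of concurrency on a single node, one rough approach is to submit several queries at the same time and watch htop while they run. The following is only a minimal Python sketch, assuming impala-shell is on the PATH and an impalad is listening on localhost:21000. The helper names `run_with_impala_shell` and `run_concurrently` are made up for illustration and are not part of Impala.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_with_impala_shell(query, host="localhost:21000"):
    """Run one query through the impala-shell CLI and return its exit code.

    Assumption: impala-shell is installed and `host` points at a running impalad.
    """
    # -i selects the impalad to connect to; -q runs a single query and exits.
    result = subprocess.run(
        ["impala-shell", "-i", host, "-q", query],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def run_concurrently(queries, runner=run_with_impala_shell, max_workers=8):
    """Submit the queries to a thread pool and return results in input order."""
    # Each worker thread blocks on its own query, so up to max_workers
    # queries are in flight on the cluster at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, queries))
```

For example, `run_concurrently(["SELECT COUNT(*) FROM my_table"] * 8)` would send eight copies of the same query at once (`my_table` is a placeholder for a table on your cluster). If joins or aggregates are the bottleneck, several cores should then show load in htop, roughly one per running query.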

 

We have some long-term plans to run all operators with a configurable degree of parallelism.

 

I'll note that this is pretty standard for analytical databases - most systems won't let a single query use all the system resources by default and in configurations used for production.
