Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

impala 2.8 mt_dop observation on perfs on views with union

Highlighted

impala 2.8 mt_dop observation on perfs on views with union

Contributor

Hi,

 

I have a view defined as `select * from t1 union select * from t2`. The 2 tables have an identical schema, t2 contains no data. 

 

Now 3 uses cases : 

- query with mt_dop=10 on the view  : 

query is slow ~10s https://gist.github.com/momohuri/9ad4ba8f6fbd1d180068c8c102291f69

- query with mt_dop=0 on the view : 
query is fast ~1s https://gist.github.com/momohuri/c4347eb7ef70a8eec63a0a62638d1ce7

- query with mt_dop=10 not using a view/union : 
query is fast  ~1sec https://gist.github.com/momohuri/62a00d9e381771aa9172b6dea09a5191

I hope those observations can help for the next releases :)

 

thanks

 

 

 

5 REPLIES 5

Re: impala 2.8 mt_dop observation on perfs on views with union

Master Collaborator

Thanks for the interesting data - it seems like in this case mt_dop didn't do too well.

 

Two observations:

* It doesn't seem like mt_dop is enabled in the last profile - I don't see it in "Query Options"

* Most of the time in the query is spent in the scan, which is multi-threaded even with mt_dop disabled - with mt_dop it always executes with 10 cores, but with mt_dop disabled it dynamically varies the number of cores, up to the number of CPUs on the machine. So with mt_dop disabled, it may actually be running with higher parallelism.

Re: impala 2.8 mt_dop observation on perfs on views with union

Contributor

Hey. 

 

You are right for the last profile, my bad...

Lets do all 4 uses cases this time.

Here are the profiles on the raw table :
no mt_dop on table :  https://gist.github.com/momohuri/0b91468ae2526c4f5b0c4dba09172147

take about the same time as:

mt_dop=10 on table: https://gist.github.com/momohuri/62fb85bb251490b01d0ee7032f377cef

 

For the view 
no mt_dop (it still take the same amout of time as the table): https://gist.github.com/momohuri/5954233819c54376c85c795b006f91ad

mt_dop=10 (it take 5x time): https://gist.github.com/momohuri/2263742803326fc2467e96a2e2a9cb5a

 

 

and you are right on the observation, the main difference in the profile/summary that I see, is the time spent on hdfs.

 

thanks

 

PS: Also an other observation, if I create view of parquet + Avro, and query it with mt_dop=10,it seems to be doing the equivalent of mt_dop=0. I didn't ran extensive test on it. but that's my first impression

Re: impala 2.8 mt_dop observation on perfs on views with union

Master Collaborator

Hi Maurin,

 

thanks for your report. The reason why MT_DOP is not kicking as much as you might wish for some queries, is that UNION is currently not parallelized with MT_DOP. When run a query with UNION and MT_DOP=X, you are actually running with the equivalent of MT_DOP=1 (not MT_DOP=0).

 

It's an unfortunate limitation that we are working on addressing. Even better, we probaby should disable union queries with mt_dop to pre-empt misunderstandings like this :)

 

Your report on Avro is interesting, and I'd like to learn more about it. Did you try queries without a union and Avro with MT_DOP performed poorly? We definitely want to know about that and appreciate your help if you have input.

 

Alex

Re: impala 2.8 mt_dop observation on perfs on views with union

Contributor

"It's an unfortunate limitation that we are working on addressing

I would greatly appreciate if you could do it :), with a lambda architecture, most of my queries are hitting a "union" view.

 

For Avro you are right, it was also because of the union, I just tried with MT_DOP and it gave me expected results.

 

thanks

Re: impala 2.8 mt_dop observation on perfs on views with union

Master Collaborator

Thanks for trying it out and reporting your findings! Always helpful to get feedback from actual users :-)

 

We'll certainly continue to lift some of the current limitations, including the union issue.

Don't have an account?
Coming from Hortonworks? Activate your account here