Contributor
Posts: 30
Registered: ‎04-07-2016

CodeGen way too slow

Hi,
I'm using Impala 2.7(8) with CDH 5.10.1.
I am trying a simple query:
`select distinct(date_col_partition) from table_1`

It takes 20 seconds.
But when I first run `set DISABLE_CODEGEN=true;`, it takes less than a second.
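
For reference, the comparison can be reproduced in a single impala-shell session roughly as follows. This is only a sketch: `DISABLE_CODEGEN` is a standard Impala query option, and the table and column names are simply the ones from the query above.

```sql
-- Run with codegen enabled (the default) and note the elapsed time.
SET DISABLE_CODEGEN=false;
SELECT DISTINCT date_col_partition FROM table_1;

-- Disable codegen for this session only, then rerun the same query.
SET DISABLE_CODEGEN=true;
SELECT DISTINCT date_col_partition FROM table_1;
```

The option applies to the current session only. Running `profile;` in impala-shell right after each query prints that query's runtime profile, which is where the codegen time shows up.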

 

Here is the profile gist: https://gist.github.com/anonymous/1a5faa3a10d4495f7b8abc3c964457db

Any idea what is going wrong?

Thanks

Cloudera Employee
Posts: 251
Registered: ‎10-16-2013

Re: CodeGen way too slow

Hi Maurin,

Thanks for posting, this is pretty interesting. What is the type of your "cuberon_event_date" column?

Alex

Cloudera Employee
Posts: 251
Registered: ‎10-16-2013

Re: CodeGen way too slow

As an experiment, it would be interesting to try the query with the same data using a different data format, e.g., text. You can do a quick CREATE TABLE test AS SELECT * FROM <original_table> and then retry the query.

Contributor
Posts: 30
Registered: ‎04-07-2016

Re: CodeGen way too slow

It is a string that looks like "YYYY-MM-DD".
The table is stored as Avro. I can try using Parquet or text if you want.

Cloudera Employee
Posts: 251
Registered: ‎10-16-2013

Re: CodeGen way too slow

Thanks. Trying Parquet would help. I just want to see if the high optimization time in codegen is due to some Avro-specific glitch.
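
A minimal sketch of that experiment, assuming `table_1` is the original Avro table. The target table names `table_1_text` and `table_1_parquet` are made up for illustration, and the copies come out unpartitioned, with the partition column carried over as a regular column.

```sql
-- Copy the Avro table into text and Parquet (hypothetical target names).
CREATE TABLE table_1_text STORED AS TEXTFILE AS SELECT * FROM table_1;
CREATE TABLE table_1_parquet STORED AS PARQUET AS SELECT * FROM table_1;

-- Rerun the query against each copy with codegen left enabled,
-- and compare the elapsed time with the Avro table.
SET DISABLE_CODEGEN=false;
SELECT DISTINCT date_col_partition FROM table_1_text;
SELECT DISTINCT date_col_partition FROM table_1_parquet;
```

If the copies run quickly with codegen on while the Avro table stays slow, that points at the Avro scan path rather than at the query itself.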

Cloudera Employee
Posts: 186
Registered: ‎07-29-2015

Re: CodeGen way too slow

Does the table have a lot of columns or anything unusual like that?

Contributor
Posts: 30
Registered: ‎04-07-2016

Re: CodeGen way too slow

It seems to be coming from Avro.
I created the table as Parquet and the query took 0.48 sec.
The table has about 900 columns, so nothing too fancy.

Thanks

Cloudera Employee
Posts: 251
Registered: ‎10-16-2013

Re: CodeGen way too slow

Thanks for investigating. We've confirmed internally that the issue is related to Avro with many columns. 900 is somewhat wide.

 

Thanks for reporting! We'll continue to look into this issue.

Contributor
Posts: 30
Registered: ‎04-07-2016

Re: CodeGen way too slow

Thanks!
If you open a JIRA, can you send me the link?
I will probably disable codegen for now and wait until you push a fix to re-enable it.
Thanks
