Created on 04-20-2017 09:08 PM - edited 09-16-2022 04:29 AM
Hi,
I am using Impala 2.7(8) with CDH 5.10.1.
I am trying a simple query:
`select distinct(date_col_partition) from table_1`
and it takes 20 seconds.
But when I first run `set DISABLE_CODEGEN=true;`, it takes less than a second.
Here is the profile gist: https://gist.github.com/anonymous/1a5faa3a10d4495f7b8abc3c964457db
Any idea what is going wrong?
Thanks
Created 04-20-2017 09:42 PM
Hi Maurin,
thanks for posting, this is pretty interesting. What is the type of your "cuberon_event_date" column?
Alex
Created 04-20-2017 09:45 PM
As an experiment, it would be interesting to try the query with the same data in a different data format, e.g., text. You can do a quick CREATE TABLE test AS SELECT * FROM <original_table> and then retry the query.
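A minimal sketch of that experiment, assuming the table and column names from the original post (`table_1`, `date_col_partition`). The copy's name `test_text` is hypothetical, and the copy is unpartitioned, so this only compares how the file format affects the query:

```sql
-- Copy the data into a text-format table (test_text is a hypothetical name).
CREATE TABLE test_text STORED AS TEXTFILE AS
SELECT * FROM table_1;

-- Re-run the slow query against the copy, with codegen left enabled.
SELECT DISTINCT date_col_partition FROM test_text;

-- Drop the copy once the timing has been compared.
DROP TABLE test_text;
```

Swapping `TEXTFILE` for `PARQUET` gives the equivalent Parquet comparison.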
Created on 04-20-2017 09:56 PM - edited 04-20-2017 09:57 PM
It is a string that looks like "YYYY-MM-DD".
The table is stored as Avro. I can try using Parquet or text if you want.
Created 04-20-2017 10:06 PM
Thanks. Trying Parquet would help. I just want to see if the high optimization time in codegen is due to some glitch with Avro.
Created 04-20-2017 10:27 PM
Does the table have a lot of columns or anything unusual like that?
Created 04-20-2017 11:10 PM
It seems to be coming from Avro.
I created the table as Parquet and the query took 0.48 sec.
The table has about 900 columns, so nothing too fancy.
Thanks
Created 04-20-2017 11:14 PM
Thanks for investigating. We've confirmed internally that the issue is related to Avro with many columns. 900 is somewhat wide.
Thanks for reporting! We'll continue to look into this issue.
Created 04-21-2017 01:32 AM
Thanks!
If you open a JIRA, can you send me the link?
I will probably disable codegen for now and wait until you push a fix to re-enable it.
Thanks
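For reference, the workaround can be limited to the session that runs the affected query. This is a sketch using the table and column names from the original post, and it assumes the statements are run in a single impala-shell session:

```sql
-- Turn off codegen for this session only.
SET DISABLE_CODEGEN=true;

-- The query that spent most of its time in codegen on the wide Avro table.
SELECT DISTINCT date_col_partition FROM table_1;

-- Turn codegen back on for other queries in the same session.
SET DISABLE_CODEGEN=false;
```

Query options set this way do not persist across sessions, so other workloads keep codegen enabled.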
Created 04-21-2017 07:50 AM