I am created a view on top of a table and it give me really bad performance results. It take about 2min to "Plan" the query on the views whereas it is only taking 55ms in the raw table.
The views is really jsut a select * from table_a union select * from table_b. Where table a and b have the same definition and are both parquet tables...
Any idea what is happening?
Computing equivalences classes is known to be slow for complex nested queries. One issue is that your views do a "select *" and if those tables are wide (>100 columns) then that can dramatically affect the planning time.
A workaround is to select only the columns you need. I realize thia may be impractical soemtimes.
Still, a lot of time is spend during single-node planning. This is the interesting snippet from your slow profile:
Planner Timeline: 2m6s
- Analysis finished: 1s009ms (1s009ms)
- Equivalence classes computed: 1m9s (1m8s) <--- slow
- Single node plan created: 2m6s (56s716ms) <--- also slow, why?
- Runtime filters computed: 2m6s (8.364ms)
- Distributed plan created: 2m6s (46.080ms)
- Lineage info computed: 2m6s (1.754ms)
- Planning finished: 2m6s (37.579ms)
Would you be willing to share your table/view schemas for further investigation?
Thanks for your answer.
I am not a huge fan of giving the exact column names, but I can tell you that both tables are paritioned by data in string format "YYYY-MM-DD" and 1798 columns of which 1795 are strings and 3 ints.
I hope that can help.