long time unaccounted for in Impala query planning phase

Hi,

 

I ran an Impala query and noticed that the planning phase took a long time, but I couldn't figure out where the time was spent. Consider the following Planner Timeline and Query Timeline from the query profile. From the code I can see that the Planner Timeline falls between the "Query submitted" and "Planning finished" milestones in the Query Timeline. However, the Planner Timeline totals about 1.4s, whereas the time between "Query submitted" and "Planning finished" is about 56s. Does anyone have any idea what might cause this discrepancy?

 

In my investigation I came across this thread: https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-Performance-Issue-Diagnosis-Hel..., which mentioned that "a very long 'planning time' often indicates that the query is bottlenecked on loading/refreshing the table metadata." Are there any KPIs in the query profile or elsewhere that would indicate that loading/refreshing table metadata is the bottleneck? If so, are there best practices (e.g., tuning some config parameters) for mitigating it?

 

Thanks,

 

Eric

 

 

Planner Timeline
Analysis finished: 96,331,949
Equivalence classes computed: 917,936,038
Single node plan created: 1,251,725,380
Runtime filters computed: 1,255,365,766
Distributed plan created: 1,266,640,723
Lineage info computed: 1,272,454,499
Planning finished: 1,406,644,473

 

Query Timeline
Query submitted: 56,292
Planning finished: 56,509,844,912
Submit for admission: 56,522,498,928
Completed admission: 56,522,824,260
Ready to start 109 fragment instances: 56,530,852,572
All 109 fragment instances started: 57,739,824,848
Rows available: 101,318,335,128
First row fetched: 101,441,023,332
Unregister query: 165,161,832,544
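
To put a number on the gap, here is a minimal Python sketch that computes it from the two timelines. It assumes the values are in nanoseconds (which matches the ~1.4s and ~56s figures mentioned above); `parse_timeline` is a hypothetical helper written for this post, not something provided by Impala.

```python
def parse_timeline(text):
    """Parse 'Event name: 1,234,567' lines into {event: value}.

    Values are assumed to be nanoseconds; thousands separators are optional.
    """
    events = {}
    for line in text.strip().splitlines():
        name, _, value = line.rpartition(":")
        events[name.strip()] = int(value.replace(",", "").strip())
    return events


PLANNER_TIMELINE = """
Analysis finished: 96,331,949
Equivalence classes computed: 917936038
Single node plan created: 1,251,725,380
Runtime filters computed: 1,255,365,766
Distributed plan created: 1,266,640,723
Lineage info computed: 1272454499
Planning finished: 1,406,644,473
"""

QUERY_TIMELINE = """
Query submitted: 56,292
Planning finished: 56,509,844,912
"""

planner = parse_timeline(PLANNER_TIMELINE)
query = parse_timeline(QUERY_TIMELINE)

# Time the planner itself reports (last event of the Planner Timeline).
planner_ns = planner["Planning finished"]
# Wall-clock time between the two Query Timeline milestones.
wall_ns = query["Planning finished"] - query["Query submitted"]
# Time spent before planning finished that the Planner Timeline does not cover.
unaccounted_ns = wall_ns - planner_ns

for label, ns in [("wall", wall_ns), ("planner", planner_ns),
                  ("unaccounted", unaccounted_ns)]:
    print(f"{label}: {ns / 1e9:.3f} s")
```

Under that assumption, roughly 55.1s of the 56.5s between "Query submitted" and "Planning finished" is not covered by any Planner Timeline event.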
