I run a particular query on a daily basis to generate a feed file from data. The query is complex, with joins across 10-12 tables. It behaves weirdly, and I have tried everything to resolve it but can't find any clue or solution.
The below is from the Impala browser. I ran the query for the date 04-04-2017:
|Query Type||Start Time||End Time||Duration||Scan Progress||State||# rows fetched||Details|
|QUERY||2017-05-09 13:04:36.053536000||2017-05-09 13:05:22.669842000||46s616ms||115 / 115 ( 100%)||FINISHED||48731|
The below is the run for 05-05-2017. It returns records up to row 30720 and then gets stuck. When I look at memory by clicking on Details, it keeps growing, but the scan progress of 105 / 115 (91.30435%) never changes. The query is shown as FINISHED, but it still remains in flight.
|Query Type||Start Time||Duration||Scan Progress||State||Last Event||# rows fetched||Details|
|QUERY||2017-05-09 13:10:38.934226000||25m3s||105 / 115 (91.3043%)||FINISHED||First row fetched||30720|
This is again an issue in the PROD environment. We have been facing it since 05-05-2017 and have no solution yet, which is a very embarrassing situation for us.
Kindly help to resolve it.
I am using impalad version 2.7.0-cdh5.9.0.
I would suggest looking at the execution summary or profile to understand where the time is going. The progress only measures progress of the table scans, so this is consistent with the time being spent in joins (or other operations) after the table scans.
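To make the point about scan progress concrete, here is a minimal Python sketch that pulls the two counters out of the "Scan Progress" cell pasted above. The function name `parse_scan_progress` is made up for illustration and is not part of any Impala tooling. The percentage it returns describes scan work only, so it says nothing about how long joins or later operators will take.

```python
import re

# Matches the leading "done / total" counters, e.g. "105 / 115 (91.3043%)".
_PROGRESS = re.compile(r"^\s*(\d+)\s*/\s*(\d+)")


def parse_scan_progress(text):
    """Return (done, total, percent) from a 'Scan Progress' cell.

    Hypothetical helper: it only reads the text copied from the query
    listing and does not talk to Impala.
    """
    match = _PROGRESS.match(text)
    if match is None:
        raise ValueError(f"unrecognised scan progress: {text!r}")
    done, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        raise ValueError("total scan count is zero")
    return done, total, 100.0 * done / total


done, total, pct = parse_scan_progress("105 / 115 (91.3043%)")
print(f"{done} of {total} scan units complete ({pct:.4f}%)")
```

A value stuck below 100% while memory keeps growing would fit the picture of work continuing in operators above the scans, which is why the summary or profile is the next place to look.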
You probably just have a very large join. Could be that the join order in the query plan is not optimal or maybe you're just running the query on too much data for the cluster size.
Usually the troubleshooting steps are something like:
Thanks for the reply Tim.
Hey, I checked and all is good. I do compute stats and refresh on all impalads. Memory is plentiful. But I still can't understand the problem.
I also noticed that it sometimes completes 10-20 minutes after showing 100% completion. This time it completed and did not hang.
|QUERY||2017-05-25 23:52:44.908690000||2017-05-26 00:44:44.421065000||51m59s||171 / 171 ( 100%)||FINISHED||6580||Details|
The maximum time I can see in the summary for a particular operation is 927ms. Other items take around 123 ms or so. So I can't see it spending much time anywhere.
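As a rough way to put those numbers side by side, the sketch below converts the duration strings quoted in this thread ("51m59s", "927ms") into seconds. `parse_duration` is a hypothetical helper written for this post, and the comparison is only a sanity check: operators run in parallel and overlap, so per-operator times are not expected to add up to the wall-clock duration.

```python
import re

# Seconds per unit for the duration strings seen in this thread.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
# "ms" is listed before "m" so that "616ms" is not read as minutes.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text):
    """Convert a string such as '46s616ms' or '25m3s' to seconds.

    Hypothetical helper: it only handles the h/m/s/ms pieces that
    appear in the rows and summary figures quoted above.
    """
    text = text.strip()
    tokens = _TOKEN.findall(text)
    # Reject input that has anything other than number+unit pieces.
    if not tokens or "".join(n + u for n, u in tokens) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in tokens)


wall_clock = parse_duration("51m59s")       # duration of the 2017-05-25 run
slowest_operator = parse_duration("927ms")  # largest time seen in the summary
print(f"wall clock: {wall_clock:.3f}s")
print(f"slowest operator: {slowest_operator:.3f}s")
print(f"ratio: {wall_clock / slowest_operator:.0f}x")
```

A gap this large suggests most of the elapsed time is not showing up as operator time in the summary, so the full profile and its timeline are probably more informative than the per-operator figures alone.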
I have looked at the execution summary. The query runs for 15 minutes and after that it gets cancelled. I have already done compute stats for the tables involved. There is no memory problem, as it shows consumption of 340 MB out of 35 GB of memory.