02-27-2018 05:59 AM
I am currently tuning queries in Impala as part of a study in which I will compare different storage formats. My queries are SELECT queries.
When I run these queries in Impala, the SELECT results are displayed on screen along with a total time, let's say T. I can find this time T in Cloudera Manager under 'Impala Query', on the 'Unregister query' line of the 'Query timeline' section.
Among all the information available there, I would like to know how to measure precisely the duration of the SELECT. I suppose that T also includes the display time, but I would like to know how much the SELECT query itself costs, without taking the display time into account.
Which duration is the most appropriate for measuring query performance in my case?
Your help would be greatly appreciated.
Thank you in advance. Have a good day.
02-27-2018 08:01 AM
I believe that you will find the best answers to your questions by reading this Cloudera document about Understanding Impala Query Performance - EXPLAIN Plans and Query Profiles
02-27-2018 08:13 AM
I highly recommend reading and understanding the Impala Cookbook. It has a section on running benchmarks:
The ClientFetchWaitTimer in the query profile indicates how much time the server is waiting for the client to issue the next fetch. A long time may mean that the client is slow or is not fetching for some other reason.
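As a rough illustration of how this counter could be used, here is a minimal Python sketch that reads a text query profile, extracts the 'Unregister query' time and ClientFetchWaitTimer, and subtracts one from the other to approximate the time not spent waiting on the client. The helper names (`parse_duration`, `find_value`, `server_side_seconds`), the `SAMPLE` profile excerpt, and the duration format (for example `5s500ms`) are assumptions for illustration only; check them against a profile from your own Impala version, and treat the subtraction as a heuristic rather than an exact measurement.

```python
import re

# Assumed duration units as they might appear in a text profile, e.g. "1s234ms".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# Two-letter units are listed first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def parse_duration(text):
    """Convert a duration string such as '1m23s' or '234.5ms' to seconds."""
    parts = _PART.findall(text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def find_value(profile_text, label):
    """Return the first duration following 'label:' in the profile, or None."""
    pattern = r"^\s*-?\s*" + re.escape(label) + r":\s*(\S+)"
    match = re.search(pattern, profile_text, re.MULTILINE)
    return parse_duration(match.group(1)) if match else None


def server_side_seconds(profile_text):
    """Approximate query time excluding time spent waiting for client fetches."""
    total = find_value(profile_text, "Unregister query")
    client_wait = find_value(profile_text, "ClientFetchWaitTimer")
    if total is None or client_wait is None:
        raise ValueError("profile is missing one of the expected counters")
    return total - client_wait


# Hypothetical profile excerpt with made-up numbers, for illustration only.
SAMPLE = """
    Query Timeline: 5s500ms
       - Rows available: 1s200ms (1s200ms)
       - Unregister query: 5s500ms (4s300ms)
     - ClientFetchWaitTimer: 3s250ms
"""

if __name__ == "__main__":
    print(f"approx. time excluding client wait: {server_side_seconds(SAMPLE):.3f}s")
```

To use it on a real query, save the profile text to a file and pass its contents to `server_side_seconds`. For format comparisons it is also worth running each query several times and comparing the resulting values rather than relying on a single run.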