Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How do I make sense of a Phoenix query explain plan?

How do I make sense of a Phoenix query explain plan?

I'm looking for a generic guide on tuning Phoenix queries as well as specific answers to the below questions:
  1. Is there a table showing how to compare different keyword in explain plans? Ex: “PARALLEL 1-WAY ROUND ROBIN FULL SCAN” versus “PARALLEL 9-WAY FULL SCAN”, which one would be faster if everything else was the same?
  2. Is there a way to determine if all of my RegionServers are being used during the execution of a particular query?
  3. What is the 9-CHUNK in “CLIENT 9-CHUNK 4352857 ROWS 943718447 BYTES” mean? Are more CHUNKs better or worse?
  4. What is the impact of using functions, such as HOUR(ts) on query execution time? Is the impact 1%, 10%, 50%, etc…?
3 REPLIES 3

Re: How do I make sense of a Phoenix query explain plan?

Expert Contributor

Please check below links are help full

1) Explain plan Jira:- https://issues.apache.org/jira/browse/PHOENIX-2702

2) FAQ for Phoenix :- https://phoenix.apache.org/faq.html

Highlighted

Re: How do I make sense of a Phoenix query explain plan?

Expert Contributor
  1. “PARALLEL 9-WAY FULL SCAN” - 9 is the salt_bucket you specified in you create table or index correct? From my limited experiences on sandbox it seems 32 is marginally faster than 8. I am yet to try to give it more and see if it would be faster or not. Also this is a full scan, if you can get it into "RANGE SCAN", it should be faster:
+---------------------------------------------------------------------------------------------+
|                                            PLAN                                             |
+---------------------------------------------------------------------------------------------+
| CLIENT 32-CHUNK PARALLEL 32-WAY ROUND ROBIN RANGE SCAN OVER SCHEMA:TABLE [1,'PREDICATE']    |
|     SERVER FILTER BY FIRST KEY ONLY                                                         |
+---------------------------------------------------------------------------------------------+

3. It seems "CHUNK" goes together with salt bucket as well. Please correct me if I am wrong.

Re: How do I make sense of a Phoenix query explain plan?

  1. Is there a table showing how to compare different keyword in explain plans? Ex: “PARALLEL 1-WAY ROUND ROBIN FULL SCAN” versus “PARALLEL 9-WAY FULL SCAN”, which one would be faster if everything else was the same?

https://phoenix.apache.org/tuning_guide.html

  1. Is there a way to determine if all of my RegionServers are being used during the execution of a particular query?

Currently with explain plan ,there is no way to determine that all your regionservers are getting used for the query , although if there is full table scan ,then all the regions will be scanned but still you can't be sure that all regionservers are getting used as it depends upon your balancer policy and the distribution can be checked from master UI.

  1. What is the 9-CHUNK in “CLIENT 9-CHUNK 4352857 ROWS 943718447 BYTES” mean? Are more CHUNKs better or worse?

https://phoenix.apache.org/tuning_guide.html

  1. What is the impact of using functions, such as HOUR(ts) on query execution time? Is the impact 1%, 10%, 50%, etc…?

it depends , for eg:- if you are applying hour(ts) where 'ts' is the first part of the primary key and you don't have functional index on it, then the query may be turned into full table scan where the impact range could be anything.

although, processing time of evaluating the function on the column value depends upon the function complexity and no. of rows you are scanning.