impala 'between predicate' not push down to kudu?

Explorer

We are running Kudu 1.3.0 with CDH 5.10 (whose Kudu client version is supposed to be 1.2).

While running the TPC-DS queries with Impala on Kudu (following https://github.com/cloudera/impala-tpcds-kit), we found that the BETWEEN predicate in query 3 is not pushed down to Kudu, so Kudu scans many rows and returns them to Impala.

This is what we found in the Impala query profile:

predicate.png

 

TPC-DS q3.sql snippet:

predicate-query.png

 

Any reply will be appreciated.

 


Re: impala 'between predicate' not push down to kudu?

Explorer
The "Using Impala with Kudu" document says: "If the WHERE clause of your query includes comparisons with the operators =, <=, <, >, >=, BETWEEN, or IN, Kudu evaluates the condition directly and only returns the relevant results. This provides optimum performance, because Kudu only returns the relevant results to Impala."

But here, with TPC-DS query 3, the BETWEEN predicate is not pushed down to Kudu.
Is something wrong?

Re: impala 'between predicate' not push down to kudu?

Explorer
Finally, I found that OR predicates are not pushed down to Kudu:
explain select * from student where age=10 or age=20 or age=50 or age=60;
+------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=1 |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| preresearch.student |
| |
| PLAN-ROOT SINK |
| | |
| 01:EXCHANGE [UNPARTITIONED] |
| | |
| 00:SCAN KUDU [preresearch.student] |
| predicates: age = 10 OR age = 20 OR age = 50 OR age = 60 |
+------------------------------------------------------------------------------------+

Re: impala 'between predicate' not push down to kudu?

Cloudera Employee

Hi @lewiss, that's correct: currently, OR (disjunctive) predicates can't be pushed down to Kudu. In theory, Impala could rewrite this query as a union of disjoint sub-selects, each using a BETWEEN predicate, but I think that optimization is currently missing (it can't be done in general, since the result sets could overlap).
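
To illustrate the overlap caveat, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It only demonstrates general SQL semantics and says nothing about what Impala or Kudu actually push down. The student table contents are made up for the example. When the branches are disjoint, a UNION ALL of sub-selects returns the same rows as the OR query; when the ranges overlap, rows matching more than one branch come back more than once.

```python
import sqlite3

# Hypothetical data standing in for the thread's `student` table.
# SQLite is used only to show SQL semantics, not Impala/Kudu behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO student (id, age) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 50), (5, 60)],
)


def ids(sql):
    """Run a query and return the sorted list of ids it yields."""
    return sorted(row[0] for row in conn.execute(sql))


# The original disjunctive query from the thread.
or_ids = ids(
    "SELECT id FROM student "
    "WHERE age = 10 OR age = 20 OR age = 50 OR age = 60"
)

# Rewrite with disjoint branches: each row can match at most one branch.
disjoint_ids = ids(
    "SELECT id FROM student WHERE age = 10 "
    "UNION ALL SELECT id FROM student WHERE age = 20 "
    "UNION ALL SELECT id FROM student WHERE age = 50 "
    "UNION ALL SELECT id FROM student WHERE age = 60"
)

# Overlapping ranges: ages 20 and 30 satisfy both branches.
overlap_or_ids = ids(
    "SELECT id FROM student "
    "WHERE age BETWEEN 10 AND 30 OR age BETWEEN 20 AND 50"
)
overlap_union_ids = ids(
    "SELECT id FROM student WHERE age BETWEEN 10 AND 30 "
    "UNION ALL SELECT id FROM student WHERE age BETWEEN 20 AND 50"
)

print("disjoint rewrite matches OR:", disjoint_ids == or_ids)
print("overlapping rewrite matches OR:", overlap_union_ids == overlap_or_ids)
```

With overlapping branches, the rewrite would need a deduplicating UNION (or branches made disjoint first), which is why it is not a rewrite that can be applied blindly.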
