Support Questions

Find answers, ask questions, and share your expertise

Partition Pruning without constant in query

avatar
Explorer

Is it possible to take advantage of partition pruning without specifying the value of the partition column as a constant in the query? Is there a way to get it from another joined small table?

 

In this example, table wallet_month has 1 record with column month_of_year, that has value '2016-05'.
Table transaction_detail is partitioned on month_of_year, and has 400 partitions.

 

Looking at the Explain plan, below query still uses all 400 paritions from TD. 

 

SELECT count(*)
FROM wallet_month WM
JOIN transaction_detail TD
AND TD.month_of_year = WM.month_of_year

 

Adding a where clause (WHERE WM.month_of_year = '2016-05') gives the desired behavior of only using 1 partition from TD, but I'd like to know if there's another way.

1 ACCEPTED SOLUTION

avatar

Yes, that's right. It's enabled for some cases by default (broadcast joins) in Impala 2.5. To enable it for a wider category of joins you can set the query option runtime_filter_mode=global.  This setting will become the default in Impala 2.7 because of the performance benefits.

View solution in original post

2 REPLIES 2

avatar
Explorer

Hmm, looks like the functionality I'm looking for is called Dynamic Partition Pruning? (in Impala 2.5 / CDH 5.7)

 

https://dzone.com/articles/apache-impala-incubating-in-cdh-57-4x-faster-for-b

 

 

avatar

Yes, that's right. It's enabled for some cases by default (broadcast joins) in Impala 2.5. To enable it for a wider category of joins you can set the query option runtime_filter_mode=global.  This setting will become the default in Impala 2.7 because of the performance benefits.