Kudu Partition on Timestamp column

I am trying to create a range partition by year on a timestamp column of a Kudu table, but I have not found a solution yet. The table has hash partitioning on the primary keys and holds 131 million records. I need to extract the records created or updated in the last 6 months. I assume a range partition on the lastupdateddate column would make that fetch faster, since it would avoid a full table scan. I would appreciate your thoughts.

 

1 REPLY

Yes, we should be able to prune partitions based on the range partitioning. https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_kudu.html#kudu_partitioning has some examples of how to set up a table with both range and hash partitions. You can specify arbitrary timestamp ranges for the partitions.
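
To make the combined hash and range layout concrete, here is a rough sketch in Python that only builds the Impala DDL text for yearly ranges; it does not connect to a cluster. The table name `events`, the `id` and `payload` columns, the bucket count of 4, and the year bounds are all made up for illustration. The explicit `CAST(... AS TIMESTAMP)` bounds are the form I would try first, so check them against the linked docs for your Impala version. As far as I know, Kudu only lets you partition on primary key columns, which is why `lastupdateddate` is added to the key here.

```python
def yearly_range_clauses(first_year, last_year):
    """Build one range partition clause per calendar year, inclusive."""
    clauses = []
    for year in range(first_year, last_year + 1):
        clauses.append(
            f"    PARTITION CAST('{year}-01-01' AS TIMESTAMP) <= VALUES "
            f"< CAST('{year + 1}-01-01' AS TIMESTAMP)"
        )
    return ",\n".join(clauses)


def build_create_table(first_year, last_year, hash_buckets=4):
    """Return a CREATE TABLE statement for a hypothetical 'events' table."""
    return (
        "CREATE TABLE events (\n"
        "  id BIGINT,\n"
        "  lastupdateddate TIMESTAMP,\n"
        "  payload STRING,\n"
        # Assumption: the range column has to be part of the primary key.
        "  PRIMARY KEY (id, lastupdateddate)\n"
        ")\n"
        f"PARTITION BY HASH (id) PARTITIONS {hash_buckets},\n"
        "  RANGE (lastupdateddate) (\n"
        f"{yearly_range_clauses(first_year, last_year)}\n"
        "  )\n"
        "STORED AS KUDU;"
    )


if __name__ == "__main__":
    print(build_create_table(2022, 2024))
```

One thing to weigh before adopting this: with `lastupdateddate` in the primary key, changing that value on an existing row would mean deleting and re-inserting the row, since key columns cannot be updated in place.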

 

You can check the Impala explain plan to see whether your WHERE predicates were converted into Kudu pushdown predicates (they're labelled "kudu predicates").
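
If you want to check that from a script rather than by eye, a small helper along these lines could pull the pushed-down predicates out of the plan text. This is only a sketch: it assumes you already have the EXPLAIN output as a string, and the sample plan below is hand-written for illustration (the table name `events` and the predicate are made up), not real Impala output.

```python
def kudu_predicate_lines(plan_text):
    """Return the predicate text from each 'kudu predicates:' line of a plan."""
    label = "kudu predicates:"
    found = []
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith(label):
            found.append(stripped[len(label):].strip())
    return found


# Hand-written, illustrative plan fragment (not captured from a real cluster).
sample_plan = """\
00:SCAN KUDU [default.events]
   kudu predicates: lastupdateddate >= TIMESTAMP '2023-07-01 00:00:00'
"""

if __name__ == "__main__":
    for predicate in kudu_predicate_lines(sample_plan):
        print(predicate)
```

An empty result for a query that filters on lastupdateddate would suggest the filter is being evaluated in Impala rather than pushed down to Kudu.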