Support Questions

GopiG · ‎07-29-2020

I am trying to create a range partition by year on a timestamp column of a Kudu table, but I could not find a solution yet. The table has a hash partition on its primary keys and holds 131 million records. I need to extract the records created or updated in the last 6 months. I assume a range partition on the lastupdateddate column would help fetch the data faster, since it would avoid a full table scan. I would appreciate your thoughts.

Tim Armstrong · ‎07-29-2020

Yes, we should be able to prune based on range partitions. https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_kudu.html#kudu_partitioning has some examples of how to set up a table with both range and hash partitions. You can specify arbitrary timestamp ranges for the partitions.
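As a rough illustration of the combined hash and range layout described above, here is a small Python sketch that generates Impala DDL with one range partition per year. The table name `events_by_year`, the columns `id` and `payload`, and the bucket count of 4 are made up for the example; only `lastupdateddate` comes from the question. It assumes the range column can be added to the primary key (Kudu requires range partition columns to be key columns) and that the data would be loaded into a newly created table, since hash partitioning is fixed when a Kudu table is created. The explicit `CAST(... AS TIMESTAMP)` bounds are a cautious choice, so check the linked documentation for the exact form your Impala version accepts.

```python
def yearly_range_partitions(first_year, last_year):
    """Return one RANGE clause per calendar year, first_year..last_year inclusive."""
    clauses = []
    for year in range(first_year, last_year + 1):
        # Lower bound is inclusive, upper bound is exclusive.
        clauses.append(
            f"  PARTITION CAST('{year}-01-01' AS TIMESTAMP) <= VALUES < "
            f"CAST('{year + 1}-01-01' AS TIMESTAMP)"
        )
    return ",\n".join(clauses)


def build_ddl(table, first_year, last_year, hash_buckets=4):
    """Build a CREATE TABLE statement with hash and yearly range partitions.

    The column list is hypothetical; replace it with the real schema.
    """
    return (
        f"CREATE TABLE {table} (\n"
        "  id BIGINT,\n"
        "  lastupdateddate TIMESTAMP,\n"
        "  payload STRING,\n"
        "  PRIMARY KEY (id, lastupdateddate)\n"
        ")\n"
        f"PARTITION BY HASH (id) PARTITIONS {hash_buckets},\n"
        "RANGE (lastupdateddate) (\n"
        f"{yearly_range_partitions(first_year, last_year)}\n"
        ")\n"
        "STORED AS KUDU;"
    )


if __name__ == "__main__":
    print(build_ddl("events_by_year", 2018, 2020))
```

The generated statement can be pasted into impala-shell. Rows whose timestamp falls outside every defined range would be rejected on insert, so the year list should cover all existing and expected data.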

You can see in the Impala explain plan whether your WHERE predicates were converted into Kudu pushdown predicates (they're labelled "kudu predicates").
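To try this for the "last 6 months" case, one option is to compute a cutoff date on the client and run EXPLAIN on the resulting query. The sketch below only builds the statement text. The table name `events_by_year` is hypothetical, and the cutoff is simplified to the first day of the month six calendar months back. Whether the predicate actually shows up under "kudu predicates" is exactly what the plan would tell you, so treat this as a starting point rather than a guarantee.

```python
from datetime import date


def six_months_before(today):
    """Return the first day of the month six calendar months before `today`."""
    # Count months since year 0, step back six, then convert back.
    month_index = today.year * 12 + (today.month - 1) - 6
    return date(month_index // 12, month_index % 12 + 1, 1)


def build_explain(table, today):
    """Build an EXPLAIN statement filtering on lastupdateddate."""
    cutoff = six_months_before(today)
    return (
        f"EXPLAIN SELECT * FROM {table} "
        f"WHERE lastupdateddate >= CAST('{cutoff.isoformat()}' AS TIMESTAMP);"
    )


if __name__ == "__main__":
    print(build_explain("events_by_year", date.today()))
```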

Cloudera Community

Kudu Partition on Timestamp column