New Contributor
Posts: 5
Registered: ‎02-09-2017
Accepted Solution

Impala/Kudu non-covering range partition to support rolling window data retention

Hi Impala/Kudu gurus,

 

I'm extremely excited by the new Impala/Kudu release that supports non-covering range partitions, as described here: https://github.com/cloudera/kudu/blob/master/docs/design-docs/non-covering-range-partitions.md

and here: https://gerrit.cloudera.org/#/c/4856/

 

However, I haven't figured out exactly how to use it to support the rolling-window data retention that our business needs. The syntax described in the second document above still seems to require a static partition specification.

 

What we need is the ability to auto-create new partitions based on a timestamp expression, so that each partition contains only x days of data. We could then drop old partitions according to our data retention policy, on a per-table basis.

 

For comparison, similar functionality is provided by Oracle's range interval partitioning:

 

 

PARTITION BY RANGE (CREATION_DATE)
INTERVAL (NUMTODSINTERVAL(7, 'DAY'))

and Vertica's partition key expression:

 

PARTITION BY (floor((((tbl.creation_ts)::date - '0001-12-31 BC'::date) / 3)))

 

Thanks,

Brian

Cloudera Employee
Posts: 16
Registered: ‎12-19-2013

Re: Impala/Kudu non-covering range partition to support rolling window data retention

Hi Brian,

 

Unfortunately, as you suspected, Kudu range partitions must be defined explicitly, so the Oracle-style interval syntax you described won't work in Impala. However, you can add and drop range partitions after the table is created, so you can manually add the next hour/day/week partition and drop the oldest historical partition. The syntax is described in the latest version of the CDH documentation:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_kudu.html#kudu_range_partitio...
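
As a rough sketch of what that rolling window could look like: the table name `events`, its columns, and the weekly boundaries below are all made up for illustration, and the example assumes the range column stores Unix epoch seconds as a BIGINT. Please check the documentation above for the exact syntax supported by your version.

```sql
-- Hypothetical table, range-partitioned by week on an epoch-seconds column.
CREATE TABLE events (
  creation_ts BIGINT,   -- assumed: Unix epoch seconds (UTC)
  id BIGINT,
  payload STRING,
  PRIMARY KEY (creation_ts, id)
)
PARTITION BY RANGE (creation_ts) (
  PARTITION 1486339200 <= VALUES < 1486944000,  -- 2017-02-06 .. 2017-02-13
  PARTITION 1486944000 <= VALUES < 1487548800   -- 2017-02-13 .. 2017-02-20
)
STORED AS KUDU;

-- Before the next week begins, add the partition that will receive its rows.
ALTER TABLE events ADD RANGE PARTITION 1487548800 <= VALUES < 1488153600;

-- Once the oldest week falls outside the retention window, drop it.
-- Dropping a range partition also deletes the rows stored in it.
ALTER TABLE events DROP RANGE PARTITION 1486339200 <= VALUES < 1486944000;
```

Because the ranges are non-covering, rows whose `creation_ts` falls outside every existing range are rejected, so the ADD statement has to run ahead of the data. The two ALTER statements would typically be issued by whatever scheduler you already use, with the boundaries computed from the current date.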

 

Best,

Matt

New Contributor
Posts: 5
Registered: ‎02-09-2017

Re: Impala/Kudu non-covering range partition to support rolling window data retention

Hi Matt,

 

It looks like we will go down this path for now. It is close enough to what we have in Vertica, although Vertica does create new partitions automatically.

 

Thanks,

Brian
