Impala/Kudu non-covering range partition to support rolling window data retention

Explorer

Hi Impala/Kudu gurus,

 

I'm extremely excited by the new Impala/Kudu release that supports non-covering range partitions, as described here: https://github.com/cloudera/kudu/blob/master/docs/design-docs/non-covering-range-partitions.md

and here: https://gerrit.cloudera.org/#/c/4856/

 

Yet I haven't figured out exactly how to use it to support the rolling-window data retention that our business needs. The syntax described in the second document above still seems to require static partition specifications.

 

What we need is the ability to auto-create new partitions based on a timestamp expression, so that each partition contains only x days of data. We can then drop old partitions according to a per-table data retention policy.

 

For comparison, Oracle's range interval partitioning offers a similar feature:

PARTITION BY RANGE (CREATION_DATE)
INTERVAL (NUMTODSINTERVAL(7, 'DAY'))

and Vertica's partition key expression:

 

PARTITION BY (floor((((tbl.creation_ts)::date - '0001-12-31 BC'::date) / 3)))

 

Thanks,

Brian

Accepted Solutions

Re: Impala/Kudu non-covering range partition to support rolling window data retention

Cloudera Employee

Hi Brian,

 

Unfortunately, Kudu partitions must be pre-defined, as you suspected, so an equivalent of the Oracle syntax you described is not available in Impala. However, you can add and drop range partitions after the table is created, so you can manually add the next hour/day/week partition and drop the oldest historical partition once it falls outside your retention window. The syntax is described in the latest version of the CDH documentation:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_kudu.html#kudu_range_partitio...
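
As a rough illustration of the manual add/drop approach, the sketch below builds the two statements a scheduled maintenance job might run each period. It is only a sketch under stated assumptions: the table name `events` is hypothetical, the range-partition column is assumed to be a BIGINT holding Unix epoch seconds, the windows are assumed to be aligned to the Unix epoch, and the `IF NOT EXISTS` / `IF EXISTS` clauses should be checked against the documentation for your Impala version.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def rolling_window_statements(table, now, days_per_partition, partitions_to_keep):
    """Return (add_stmt, drop_stmt) for one maintenance run.

    Assumptions (not from the original thread): the range column stores
    Unix epoch seconds as BIGINT, and every partition is a fixed-width
    window aligned to the epoch. `now` should be a timezone-aware datetime.
    """
    width = days_per_partition * SECONDS_PER_DAY
    # Index of the fixed-width window that contains `now`.
    current = int(now.timestamp()) // width

    # Pre-create the window after the current one so new rows
    # always have a partition to land in.
    add_lo = (current + 1) * width
    add_hi = (current + 2) * width

    # Retire the window that is `partitions_to_keep` windows behind
    # the current one.
    drop_lo = (current - partitions_to_keep) * width
    drop_hi = drop_lo + width

    add_stmt = (
        f"ALTER TABLE {table} ADD IF NOT EXISTS RANGE PARTITION "
        f"{add_lo} <= VALUES < {add_hi}"
    )
    drop_stmt = (
        f"ALTER TABLE {table} DROP IF EXISTS RANGE PARTITION "
        f"{drop_lo} <= VALUES < {drop_hi}"
    )
    return add_stmt, drop_stmt
```

For example, calling `rolling_window_statements("events", datetime.now(timezone.utc), 7, 12)` from a daily scheduled job would produce statements that keep roughly twelve weekly partitions plus the current and next ones. The statements themselves would still need to be submitted through impala-shell or another client, and a job that misses several runs would have to catch up on the partitions it skipped.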

 

Best,

Matt

Re: Impala/Kudu non-covering range partition to support rolling window data retention

Explorer

Hi Matt,

 

It seems we are going down this path for now. It is close enough to what we have in Vertica, although Vertica does create new partitions automatically.

 

Thanks,

Brian
