Master
Posts: 326
Registered: ‎07-01-2015
Accepted Solution

Kudu range partitions extension

Hi,

 I have a simple table with range partitions defined by upper and lower bounds.

 

CREATE TABLE work.sales_by_year (
year INT, sale_id INT, amount INT,
PRIMARY KEY (sale_id, year)
)
PARTITION BY RANGE (year) (
PARTITION VALUES < 2015,
PARTITION 2015 <= VALUES < 2016,
PARTITION 2016 <= VALUES
)
STORED AS KUDU;

 

So this table has three partitions:

+--------+-----------+----------+----------------+------------+
| # Rows | Start Key | Stop Key | Leader Replica | # Replicas |
+--------+-----------+----------+----------------+------------+
| -1     |           | 800007DF | host1:7050     | 3          |
| -1     | 800007DF  | 800007E0 | host2:7050     | 3          |
| -1     | 800007E0  |          | host3:7050     | 3          |
+--------+-----------+----------+----------------+------------+

 

Now I would like to end the last range with 2017 and have another interval for values >= 2017.

 

I tried several syntaxes, but none of them work:

 

alter table work.sales_by_year add range partition 2016 <= VALUES < 2017;
Query: alter table work.sales_by_year add range partition 2016 <= VALUES < 2017
ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year
CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2016 <= VALUES < 2017

 

alter table work.sales_by_year add range partition VALUE = 2017;
Query: alter table work.sales_by_year add range partition VALUE = 2017
ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year
CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2017 <= VALUES < 2018

 

These error messages are misleading: when I run SHOW PARTITIONS, I still see only those three intervals, with no partition for 2017 or 2018.

 

Any hints on how to extend the range partitions?

Thanks

Cloudera Employee
Posts: 20
Registered: ‎02-23-2017

Re: Kudu range partitions extension


The error messages you're seeing indicate that such alterations are not supported. The existing partition 2016 <= VALUES already covers every year from 2016 upward, so any new range at or above 2016 overlaps it. AFAICT this would entail splitting tablets, which isn't supported at the moment. For more details on what's currently implemented, see: https://kudu.apache.org/docs/schema_design.html#range-partitioning-example

 

The doc gives two examples, noting that Example 2 is more flexible because it can add further partitions. Your partition schema more closely resembles Example 1, which can't.

 

Depending on how much data you have, you might consider creating a new table with a more flexible partition schema (e.g. 2014, 2015, 2016, 2017, a la Example 2 in the linked docs) and re-inserting the data from the existing table into it.
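
A minimal sketch of that migration in Impala SQL. The table name work.sales_by_year_v2 and the 2014 to 2017 ranges are assumptions for illustration; rows whose year falls outside every listed range would not be written, so pick ranges that cover your data.

```sql
-- Hypothetical replacement table: every range has an explicit lower and upper bound.
CREATE TABLE work.sales_by_year_v2 (
  sale_id INT, year INT, amount INT,  -- key columns listed first, in key order
  PRIMARY KEY (sale_id, year)
)
PARTITION BY RANGE (year) (
  PARTITION 2014 <= VALUES < 2015,
  PARTITION 2015 <= VALUES < 2016,
  PARTITION 2016 <= VALUES < 2017,
  PARTITION 2017 <= VALUES < 2018
)
STORED AS KUDU;

-- Copy the existing rows, naming the columns so the order matches the new table.
INSERT INTO work.sales_by_year_v2
SELECT sale_id, year, amount FROM work.sales_by_year;
```

Because no partition in this layout is open-ended, a later range such as 2018 <= VALUES < 2019 would not overlap anything that already exists.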

Master
Posts: 326
Registered: ‎07-01-2015

Re: Kudu range partitions extension

This is confusing. The Apache Kudu User Guide says on p. 27:

 

Partitioning Limitations

  • Tables must be manually pre-split into tablets using simple or compound primary keys. Automatic splitting is not yet possible. Range partitions may be added or dropped after a table has been created. See Schema Design for more information.

 

 

Master
Posts: 326
Registered: ‎07-01-2015

Re: Kudu range partitions extension

And also on p.29:

 

New Features in Kudu 0.10.0

  • Users may now manually manage the partitioning of a range-partitioned table. When a table is created, the user may specify a set of range partitions that do not cover the entire available key space. A user may add or drop range partitions to existing tables. This feature can be particularly helpful with time series workloads in which new partitions can be created on an hourly or daily basis. Old partitions may be efficiently dropped if the application does not need to retain historical data past a certain point.

Master
Posts: 326
Registered: ‎07-01-2015

Re: Kudu range partitions extension

So the correct answer is that:

 Tables with range partitions defined via upper and lower boundaries cannot be extended.

 Tables with partitions defined as a single value can be extended.

 

Cloudera Employee
Posts: 20
Registered: ‎02-23-2017

Re: Kudu range partitions extension

No, it's actually the opposite:

  • Tables whose range partitions are bounded, i.e. defined with explicit upper and lower bounds, can have partitions added to them.
  • Tables whose range partitions are unbounded, i.e. defined by a single split value, cannot be extended in today's Kudu.

 

From the range partitioning docs:

"The second example [with upper/lower bounds specified] is more flexible than the first [with a split defined], because it allows range partitions for future years to be added to the table. In the first example, all writes for times after 2016-01-01 will fall into the last partition, so the partition may eventually become too large for a single tablet server to handle."
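
To illustrate the bounded case, here is a short sketch in Impala SQL. The table work.sales_bounded and its ranges are hypothetical, chosen only to show a layout whose last partition has an upper bound.

```sql
-- Hypothetical table: the last range partition stops at 2017 instead of being open-ended.
CREATE TABLE work.sales_bounded (
  sale_id INT, year INT, amount INT,
  PRIMARY KEY (sale_id, year)
)
PARTITION BY RANGE (year) (
  PARTITION 2015 <= VALUES < 2016,
  PARTITION 2016 <= VALUES < 2017
)
STORED AS KUDU;

-- No existing partition covers 2017, so the new range does not overlap anything.
ALTER TABLE work.sales_bounded ADD RANGE PARTITION 2017 <= VALUES < 2018;

-- Old ranges can be dropped; this also discards the rows stored in them.
ALTER TABLE work.sales_bounded DROP RANGE PARTITION 2015 <= VALUES < 2016;

-- List the resulting ranges.
SHOW RANGE PARTITIONS work.sales_bounded;
```

The same ADD RANGE PARTITION statement fails on the table from the original question because its last partition, 2016 <= VALUES, has no upper bound.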
