- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Kudu range partitons extension
- Labels:
-
Apache Kudu
Created on 01-20-2018 09:56 AM - edited 09-16-2022 05:45 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have a simple table with range partitions defined by upper and lower bounds.
CREATE TABLE work.sales_by_year (
year INT, sale_id INT, amount INT,
PRIMARY KEY (sale_id, year)
)
PARTITION BY RANGE (year) (
PARTITION VALUES < 2015,
PARTITION 2015 <= VALUES < 2016,
PARTITION 2016 <= VALUES
)
STORED AS KUDU;
So this table has three partitions:
+--------+-----------+----------+-------------------------------------------------+------------+
| # Rows | Start Key | Stop Key | Leader Replica | # Replicas |
+--------+-----------+----------+-------------------------------------------------+------------+
| -1 | 800007DF | host1:7050 | 3 |
| -1 | 800007DF | 800007E0 | host2:7050 | 3 |
| -1 | 800007E0 | | host3:7050 | 3 |
+--------+-----------+----------+-------------------------------------------------+------------+
Now I would like to end the last range with 2017 and have another interval for values >= 2017.
I tried multiple syntaxes, but it does not work:
alter table work.sales_by_year add range partition 2016 <= VALUES < 2017; Query: alter table work.sales_by_year add range partition 2016 <= VALUES < 2017 ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2016 <= VALUES < 2017
alter table work.sales_by_year add range partition VALUE = 2017; Query: alter table work.sales_by_year add range partition VALUE = 2017 ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2017 <= VALUES < 2018
These error messages are misleading, if I run show partitions, I am having still those three intervals, so no 2017 and 2018.
Any hints how to extend the range partitons?
Thanks
Created 01-29-2018 01:52 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So the correct answer is that:
Tables with range partitions defined via upper and lower boundaries cannot be extended.
Tables with partitions defined as a single value can be extended.
Created on 01-22-2018 09:24 PM - edited 01-22-2018 09:27 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The error messages you're seeing seem to indicate that such alterations are not supported. AFAICT his would entail splitting tablets, which isn't supported at the moment. See here for more details on what's currently implemented: https://kudu.apache.org/docs/schema_design.html#range-partitioning-example
The doc gives two examples, noting that Example 2 is more flexible because it can add further partitions. Your partition schema more closely resembles Example 1, which can't.
Depending on how much data you have, you might consider creating a new table with a more flexible partition schema (e.g. 2014, 2015, 2016, 2017, a la Example 2 in the linked docs), and re-insert into this new table from the existing table.
Created 01-28-2018 05:00 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It is confusing, Apache Kudu User Guide, p.27:
Partitioning Limitations • Tables must be manually pre-split into tablets using simple or compound primary keys. Automatic splitting is not yet possible. Range partitions may be added or dropped after a table has been created. See Schema Design for more information.
Created 01-28-2018 05:02 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
And also on p.29:
New Features in Kudu 0.10.0 • Users may now manually manage the partitioning of a range-partitioned table. When a table is created, the user may specify a set of range partitions that do not cover the entire available key space. A user may add or drop range partitions to existing tables. This feature can be particularly helpful with time series workloads in which new partitions can be created on an hourly or daily basis. Old partitions may be efficiently dropped if the application does not need to retain historical data past a certain point.
Created 01-29-2018 01:52 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So the correct answer is that:
Tables with range partitions defined via upper and lower boundaries cannot be extended.
Tables with partitions defined as a single value can be extended.
Created 01-29-2018 11:00 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, it's actually the opposite:
- Tables with bounded range partitions defined with upper/lower bounds can have partitions added to it.
- Tables with unbounded range partitions defined with a single value cannot be extended in today's Kudu.
From the range partitioning docs:
"The second example [with upper/lower bounds specified] is more flexible than the first [with a split defined], because it allows range partitions for future years to be added to the table. In the first example, all writes for times after 2016-01-01 will fall into the last partition, so the partition may eventually become too large for a single tablet server to handle."
