Kudu range partitions extension


Master Collaborator

Hi,

 I have a simple table with range partitions defined by upper and lower bounds.

 

CREATE TABLE work.sales_by_year (
year INT, sale_id INT, amount INT,
PRIMARY KEY (sale_id, year)
)
PARTITION BY RANGE (year) (
PARTITION VALUES < 2015,
PARTITION 2015 <= VALUES < 2016,
PARTITION 2016 <= VALUES
)
STORED AS KUDU;

 

So this table has three partitions:

+--------+-----------+----------+----------------+------------+
| # Rows | Start Key | Stop Key | Leader Replica | # Replicas |
+--------+-----------+----------+----------------+------------+
| -1     |           | 800007DF | host1:7050     | 3          |
| -1     | 800007DF  | 800007E0 | host2:7050     | 3          |
| -1     | 800007E0  |          | host3:7050     | 3          |
+--------+-----------+----------+----------------+------------+
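
For readers wondering how the hex start and stop keys relate to the years in the DDL, here is a small Python sketch that decodes them. It assumes the keys are big-endian 32-bit integers with the sign bit flipped (so that byte order matches numeric order), which is consistent with the values shown above; the helper name `decode_int32_key` is made up for this illustration.

```python
import struct

def decode_int32_key(hex_key: str):
    """Decode a hex INT32 range key; an empty string means 'unbounded'."""
    if not hex_key:
        return None
    raw = int(hex_key, 16)
    # Assumption: the stored key is the int32 value with its sign bit flipped,
    # written big-endian. Flipping the bit back recovers the original value.
    flipped = raw ^ 0x80000000
    return struct.unpack(">i", struct.pack(">I", flipped))[0]

# (start key, stop key) pairs copied from the partition listing above
partitions = [("", "800007DF"), ("800007DF", "800007E0"), ("800007E0", "")]
decoded = [(decode_int32_key(a), decode_int32_key(b)) for a, b in partitions]
print(decoded)
```

Under that assumption the three tablets cover everything below 2015, the range from 2015 up to (but not including) 2016, and everything from 2016 upward, with `None` standing for an open end.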

 

Now I would like to end the last range with 2017 and have another interval for values >= 2017.

 

I tried several syntaxes, but none of them work:

 

alter table work.sales_by_year add range partition 2016 <= VALUES < 2017;
Query: alter table work.sales_by_year add range partition 2016 <= VALUES < 2017
ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year
CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2016 <= VALUES < 2017

 

alter table work.sales_by_year add range partition VALUE = 2017;
Query: alter table work.sales_by_year add range partition VALUE = 2017
ERROR: ImpalaRuntimeException: Error adding range partition in table sales_by_year
CAUSED BY: NonRecoverableException: New range partition conflicts with existing range partition: 2017 <= VALUES < 2018

 

These error messages are misleading: if I run SHOW PARTITIONS, I still have only those three intervals, so there is no 2017 or 2018 partition.

 

Any hints on how to extend the range partitions?

Thanks

Re: Kudu range partitions extension

Contributor

The error messages you're seeing seem to indicate that such alterations are not supported. AFAICT this would entail splitting tablets, which isn't supported at the moment. See here for more details on what's currently implemented: https://kudu.apache.org/docs/schema_design.html#range-partitioning-example

 

The doc gives two examples, noting that Example 2 is more flexible because it can add further partitions. Your partition schema more closely resembles Example 1, which can't.

 

Depending on how much data you have, you might consider creating a new table with a more flexible partition schema (e.g. 2014, 2015, 2016, 2017, a la Example 2 in the linked docs), and re-insert into this new table from the existing table.
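
As a rough sketch of that migration, the Python helpers below only build the Impala statements as strings; they do not connect to anything. The target table name `work.sales_by_year_v2` and the helper names are hypothetical, the column list is copied from the original post, and the `ADD RANGE PARTITION` form mirrors the statement the asker tried. None of this has been run against a cluster.

```python
def year_partition(year: int) -> str:
    """One bounded range partition covering exactly one year."""
    return f"PARTITION {year} <= VALUES < {year + 1}"

def create_table_sql(table: str, years) -> str:
    """CREATE TABLE with one bounded partition per year (no open-ended ranges)."""
    parts = ",\n  ".join(year_partition(y) for y in years)
    return (
        f"CREATE TABLE {table} (\n"
        "  year INT, sale_id INT, amount INT,\n"
        "  PRIMARY KEY (sale_id, year)\n"
        ")\n"
        "PARTITION BY RANGE (year) (\n"
        f"  {parts}\n"
        ")\n"
        "STORED AS KUDU;"
    )

def copy_sql(src: str, dst: str) -> str:
    """Re-insert existing rows; rows whose year has no partition would not fit."""
    return f"INSERT INTO {dst} SELECT year, sale_id, amount FROM {src};"

def add_partition_sql(table: str, year: int) -> str:
    """Statement for adding a later year once the table already exists."""
    return f"ALTER TABLE {table} ADD RANGE PARTITION {year} <= VALUES < {year + 1};"

new_table = "work.sales_by_year_v2"  # hypothetical name
print(create_table_sql(new_table, range(2014, 2018)))
print(copy_sql("work.sales_by_year", new_table))
print(add_partition_sql(new_table, 2018))
```

Because every partition here has both a lower and an upper bound, the key space above the last year stays uncovered, which is what should leave room for adding 2018 later. The old table's `VALUES < 2015` partition may hold years earlier than 2014, so the year list would need to cover whatever data actually exists.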

Re: Kudu range partitions extension

Master Collaborator

It is confusing. The Apache Kudu User Guide, p. 27, says:

 

Partitioning Limitations
  • Tables must be manually pre-split into tablets using simple or compound primary keys. Automatic splitting is not yet possible. Range partitions may be added or dropped after a table has been created. See Schema Design for more information.

 

 

Re: Kudu range partitions extension

Master Collaborator

And also on p.29:

 

New Features in Kudu 0.10.0
  • Users may now manually manage the partitioning of a range-partitioned table. When a table is created, the user may specify a set of range partitions that do not cover the entire available key space. A user may add or drop range partitions to existing tables. This feature can be particularly helpful with time series workloads in which new partitions can be created on an hourly or daily basis. Old partitions may be efficiently dropped if the application does not need to retain historical data past a certain point.

Re: Kudu range partitions extension

Master Collaborator

So the correct answer is that:

 Tables with range partitions defined via upper and lower boundaries cannot be extended.

 Tables with partitions defined as a single value can be extended.

 


Re: Kudu range partitions extension

Contributor

No, it's actually the opposite:

  • Tables with bounded range partitions defined with upper/lower bounds can have partitions added to them.
  • Tables with unbounded range partitions defined with a single value cannot be extended in today's Kudu.

 

From the range partitioning docs:

"The second example [with upper/lower bounds specified] is more flexible than the first [with a split defined], because it allows range partitions for future years to be added to the table. In the first example, all writes for times after 2016-01-01 will fall into the last partition, so the partition may eventually become too large for a single tablet server to handle."
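
To make the distinction concrete, here is a toy Python model of the overlap check. It is an illustration only, not Kudu's actual implementation: partitions are treated as half-open intervals `[lower, upper)`, a missing bound is represented as infinity, and the names `overlaps` and `conflicts` are invented for this sketch.

```python
import math

INF = math.inf

def overlaps(a, b):
    """True if half-open ranges a = [lo, hi) and b = [lo, hi) share any value."""
    return a[0] < b[1] and b[0] < a[1]

def conflicts(existing, new):
    """Existing partitions that the proposed new range would overlap."""
    return [p for p in existing if overlaps(p, new)]

# Layout from the original question: first and last partitions are open-ended,
# so together the three ranges cover the whole key space.
unbounded_layout = [(-INF, 2015), (2015, 2016), (2016, INF)]

# Layout with explicit bounds on every partition: nothing is covered past 2017.
bounded_layout = [(2014, 2015), (2015, 2016), (2016, 2017)]

print(conflicts(unbounded_layout, (2016, 2017)))
print(conflicts(unbounded_layout, (2017, 2018)))
print(conflicts(bounded_layout, (2017, 2018)))
```

In this model both ranges the asker tried to add land inside the existing open-ended `2016 <= VALUES` partition, which matches the reported conflict errors, while the fully bounded layout leaves 2017 onward free for a new partition.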
