Support Questions

Find answers, ask questions, and share your expertise

The number of partitions recommended in Kudu

avatar
New Contributor
The current cluster has 7 tablet servers and 448 CPU cores. It is planned to set Hash partitions to cooperate with Range partitions. The Range partition divides the ten year data into 120 partitions by month. How many Hash partitions based on other fields is appropriate? In addition, the main read/write scenario is a large number of reads/writes based on the current month's data.
1 ACCEPTED SOLUTION

avatar
Contributor

By default, only 60 tablets (partitions) can be created at table creation time.  See the following limitation:

https://kudu.apache.org/docs/known_issues.html#_scale

But you can add any number of tablets later on by adding range partitions once a table is created. (hash partitions will not be allowed to add after table creation)

 

so you can not create 120 partition during table creation, but since you need hash partition, so I suggest you create 20 range partition multipled with 3 hash partition (need to use primary key columns for hash partition), so in this way, you create a 60 tablets table,  then after table creation, you can add more range partition by using "alter table" command from impala.

 

For the question about the number of hash partitions for a kudu table, there is no such formula, the purpose of hash parittion is just try to distribute rows by hash value and spread writes randomly among tablets, which helps mitigate hot-spotting and uneven tablet sizes.

View solution in original post

2 REPLIES 2

avatar
Contributor

By default, only 60 tablets (partitions) can be created at table creation time.  See the following limitation:

https://kudu.apache.org/docs/known_issues.html#_scale

But you can add any number of tablets later on by adding range partitions once a table is created. (hash partitions will not be allowed to add after table creation)

 

so you can not create 120 partition during table creation, but since you need hash partition, so I suggest you create 20 range partition multipled with 3 hash partition (need to use primary key columns for hash partition), so in this way, you create a 60 tablets table,  then after table creation, you can add more range partition by using "alter table" command from impala.

 

For the question about the number of hash partitions for a kudu table, there is no such formula, the purpose of hash parittion is just try to distribute rows by hash value and spread writes randomly among tablets, which helps mitigate hot-spotting and uneven tablet sizes.

avatar
New Contributor

Thank you very much for your answer, which made me realize that my understanding of kudu partition tables is very inadequate, especially in terms of usage restrictions. Once again, I would like to express my gratitude to you and wish you a happy life.