Created 05-26-2023 01:58 PM
Hello Cloudera community,
In the chart "total_kudu_on_disk_size_across_kudu_replicas" we see tables of around 500GB.
Given that, what is the recommended size for a Kudu table?
Created 05-29-2023 05:37 AM
Hi,
I could not find a recommended size for a Kudu table, but there are limits on things like the amount of data per tablet and the number of tablets per table. Please refer to the documentation below:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/kudu_limitations.html#scaling_limits
Regards,
Chethan YM
Created 05-29-2023 02:01 PM
hi @ChethanYM ,
I read this documentation, but my doubt is about the difference between tablet and table.
If I look at the chart in Cloudera and see tables above 50GB, does that mean they are outside the recommended range?
Created 05-30-2023 02:52 AM
Here the key is not the table size but the tablets.
One table of 50GB could have 50 tablets, so each tablet is 1GB (that's good)
or
One table of 50GB could have 2 tablets, so each tablet is 25GB (that's not so good: the recommended target size for tablets is under 10 GiB)
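As a rough sketch of that comparison in Python (the function name and status labels are made up for illustration, and it assumes the data is spread evenly across tablets, which real tables may not do):

```python
GIB = 1024 ** 3

# Thresholds quoted in this thread from the Kudu scaling limits.
TARGET_TABLET_BYTES = 10 * GIB  # recommended target: under 10 GiB per tablet
MAX_TABLET_BYTES = 50 * GIB     # recommended maximum per tablet


def tablet_size_status(table_bytes, num_tablets):
    """Return (average tablet size in bytes, status label).

    Hypothetical helper: assumes evenly sized tablets.
    """
    if num_tablets <= 0:
        raise ValueError("num_tablets must be positive")
    avg = table_bytes / num_tablets
    if avg < TARGET_TABLET_BYTES:
        status = "ok"
    elif avg <= MAX_TABLET_BYTES:
        status = "above target"
    else:
        status = "above limit"
    return avg, status


# The two cases from the post: a 50 GiB table with 50 tablets or with 2 tablets.
for tablets in (50, 2):
    avg, status = tablet_size_status(50 * GIB, tablets)
    print(f"{tablets} tablets: {avg / GIB:.1f} GiB each ({status})")
```

The point is only that the check is done per tablet, not per table; the real per-tablet sizes should still be read from the Kudu UI.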
You can take a look at your Kudu Master UI: http://Master:8051/tables and look for your tables and their partitions (tablets).
I'm using this query in the chart builder to see the Kudu table sizes:
select total_kudu_on_disk_size_across_kudu_replicas where category=KUDU_TABLE
Created 05-30-2023 05:35 AM
hi @Juanes ,
great!
So, let's assume I have a 500GB table and that table was created with 240 tablets, would that value be within the recommended range?
Another point:
I'm using the following calculation as an example:
DATA_SIZE = (value taken from the graph "total_kudu_on_disk_size_across_kudu_replicas")
NUM_REPLICAS = RF * Total Tablets (value taken from the ksck command)
TABLET_SIZE = DATA_SIZE / NUM_REPLICAS
DATA_SIZE = 147G (converted to bytes, giving "157840048128")
NUM_REPLICAS = 3 * 240 = 720
Name | RF | State | Total Tablets | Healthy | Recovering | Underreplicated | not available
impala::DATABASE01.TABLE01 | 3 | HEALTHY | 240 | 240 | 0 | 0 | 0
TABLET_SIZE = 157840048128 / 720 = 219222289 bytes (which equals about 0.20 GiB, or roughly 209 MiB)
The end result was about 0.20 GiB. Does that mean each tablet replica holds about 0.2GB?
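To double-check the arithmetic, here is a small Python sketch of the same calculation. It assumes the chart value is the on-disk size summed over all replicas (as the metric name suggests) and that replicas are evenly sized; the function name is hypothetical.

```python
GIB = 1024 ** 3


def avg_replica_size(total_on_disk_bytes, replication_factor, total_tablets):
    """Average on-disk bytes per tablet replica (hypothetical helper).

    NUM_REPLICAS = RF * Total Tablets, as in the calculation above.
    """
    num_replicas = replication_factor * total_tablets
    if num_replicas <= 0:
        raise ValueError("replication factor and tablet count must be positive")
    return total_on_disk_bytes / num_replicas


data_size = 147 * GIB  # 157840048128 bytes, the value read from the chart
size = avg_replica_size(data_size, replication_factor=3, total_tablets=240)
print(f"{size / GIB:.2f} GiB per replica")  # prints "0.20 GiB per replica"
```

This is only an average, so a skewed partitioning could still leave individual tablets much larger than this figure.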
Created 05-30-2023 07:25 AM
Hi again,
You should be able to see the tablet sizes of every table in the Kudu Tablet Server UI:
http://KUDUTABLET1:8050/tablets
Go to "Tablets" in the top menu and type the name of your table in the search box.
You will see all of its tablets (blocks) and some interesting information such as
Tablet ID, Partition, State, On-disk size and RaftConfig (Master)
From there you can check whether the tablets are more or less the same size.
Created 05-30-2023 11:00 AM
hi @Juanes,
I accessed a tablet server on port 8050 and checked the tablets. I found more than 30 tablet IDs with the same table name, each using 4.4GB on disk.
According to Cloudera's documentation, a tablet can hold a maximum of 50GB, so the size of the tablets I found is within the recommended range, right?
Created 05-30-2023 11:08 AM
Correct, 50GB is the limit, recommended: 10GB 🙂
Created 05-30-2023 11:09 AM
OK! @Juanes 😉
thanks for the clarification.