Explorer
Posts: 13
Registered: ‎05-19-2018

Kudu Tablet Sizing Recommendation

Hi guys

 

For one of our use cases we have about 30 TB of data, compressed in Parquet. We are now testing Kudu, and I'm wondering how big a single tablet should be to get the best performance (write and query). Is there a recommendation, e.g. 1 GB per tablet? What we see is that the bigger a tablet gets, the slower inserts seem to be. However, 1 GB is way too small: at 1 GB per tablet we would need 15 servers (30,000 GB / 2,000 [max number of tablets per server as written in the doc] = 15), without taking replication into account. Additionally, the doc recommends not using more than 100 servers...
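To make the arithmetic above easier to play with, here is a rough sketch in Python. The 30,000 GB and the 2,000 tablets per server come from the post; the function name, the candidate tablet sizes, and the replication factor of 3 are only assumptions for illustration, not official sizing guidance.

```python
import math

def tablet_servers_needed(total_gb, tablet_gb, max_tablets_per_server=2000, replication=1):
    """Estimate how many tablet servers are needed for a given tablet size.

    Hypothetical helper: it only divides and rounds up, and ignores
    disk capacity, memory, and any other per-server limit.
    """
    tablets = math.ceil(total_gb / tablet_gb)       # tablets before replication
    replicas = tablets * replication                # every replica counts against a server
    return math.ceil(replicas / max_tablets_per_server)

# Compare a few candidate tablet sizes (in GB), with and without replication.
for tablet_gb in (1, 10, 50):
    for replication in (1, 3):
        servers = tablet_servers_needed(30_000, tablet_gb, replication=replication)
        print(f"{tablet_gb} GB tablets, replication {replication}: {servers} servers")
```

With 1 GB tablets and no replication this gives the 15 servers mentioned above; with a replication factor of 3 the same tablet size needs three times as many.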

 

We are working with Kudu 1.6.0.

 

Thanks in advance

Cloudera Employee
Posts: 70
Registered: ‎04-08-2014

Re: Kudu Tablet Sizing Recommendation

Hi, we have no particular guidance for maximum tablet size. Ingesting in random order will hurt performance; if you can write in sorted primary key order, that will help. Otherwise, Kudu will constantly be working in the background to merge and compact the rows you wrote into non-overlapping, contiguous RowSets.
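As a rough illustration of writing in primary key order, the sketch below sorts rows by their key columns on the client side and hands them out in batches. Everything in it is an assumption for the example: the helper name, the `host`/`ts` key columns, and the idea that the rows fit in memory. It does not call any Kudu API; each batch would be passed to whatever client you use for the actual inserts.

```python
from itertools import islice

def sorted_batches(rows, key_columns, batch_size):
    """Yield batches of rows ordered by the given primary key columns.

    Hypothetical helper: sorts in memory, so it only suits data sets
    (or chunks of one) that fit on the client.
    """
    ordered = sorted(rows, key=lambda row: tuple(row[c] for c in key_columns))
    it = iter(ordered)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Example rows with an assumed compound primary key (host, ts).
rows = [
    {"host": "b", "ts": 2, "value": 0.5},
    {"host": "a", "ts": 5, "value": 0.1},
    {"host": "a", "ts": 1, "value": 0.9},
]

for batch in sorted_batches(rows, ("host", "ts"), batch_size=2):
    print([(r["host"], r["ts"]) for r in batch])
```

Python's tuple ordering is only a stand-in for Kudu's key ordering here; for simple integer and string keys the two should agree, but that is worth checking for your own schema.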

 

Another thing you can do if you cannot write in PK-sorted order is to insert more slowly, to give Kudu time to "catch up" while it reorganizes the data on disk. Your inserts should get faster again after some time.
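One simple way to "insert slower" is to cap the client-side row rate. The sketch below is only an illustration of that idea: the `throttled` helper and the rows-per-second figure are made up for the example, and the right rate for a given cluster would have to be found by experiment.

```python
import time

def throttled(batches, max_rows_per_sec, sleep=time.sleep, clock=time.monotonic):
    """Yield batches no faster than max_rows_per_sec on average.

    Hypothetical helper: the caller writes each batch as it is yielded;
    time spent writing counts toward the budget, so the generator only
    sleeps when the writer is running ahead of the target rate.
    """
    start = clock()
    sent = 0
    for batch in batches:
        yield batch
        sent += len(batch)
        due = start + sent / max_rows_per_sec   # earliest time `sent` rows are allowed
        delay = due - clock()
        if delay > 0:
            sleep(delay)

# Usage: replace the print with the real insert call of your client.
for batch in throttled([[1, 2, 3], [4, 5, 6]], max_rows_per_sec=1000):
    print(f"writing {len(batch)} rows")
```

The `sleep` and `clock` parameters exist only so the pacing can be tested without waiting in real time.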