Hi guys
My question is related to the following two metrics:
- kudu_on_disk_data_size [Space used by this tablet's data blocks.] -> 1494 MB
- kudu_on_disk_size [Size of this tablet on disk.] -> 3010 MB
I've verified those two metrics for one example tablet. The kudu_on_disk_size value makes sense, since it matches what I see with "du" on Linux. However, how is it possible that kudu_on_disk_size is about twice as large as kudu_on_disk_data_size in my example? What additional data is stored on disk besides the raw data itself?
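To put numbers on the gap, here is a quick calculation using only the two values quoted above. The variable names are just my own labels, not Kudu identifiers, and the values are in MB as reported for this one tablet:

```python
# Values reported by the two metrics for the example tablet (in MB).
data_mb = 1494   # kudu_on_disk_data_size
total_mb = 3010  # kudu_on_disk_size

# Space on disk that is not accounted for by the data blocks.
extra_mb = total_mb - data_mb
ratio = total_mb / data_mb

print(f"unexplained space: {extra_mb} MB")
print(f"total / data ratio: {ratio:.2f}")
```

So roughly half of the tablet's on-disk footprint is something other than data blocks, and that is the part I'd like to understand.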
A small hint regarding the data on this tablet: the schema has 21 columns, 8 of which are primary key columns (all integers).
What I can say is that the kudu_on_disk_data_size value is more or less the same as the size of the same data in Parquet format. At least that part makes sense to me.
Thanks in advance