Created 01-27-2021 11:10 AM
I deleted some data from 100+ tables, but I don't see any change in Kudu disk usage or Kudu soft memory.
Created 01-27-2021 10:26 PM
Hello @sam8686
Thanks for using Cloudera Community. Based on your post, your team deleted data from 100+ tables, yet there is no change in disk usage. Compaction most likely has to run before any reduction in space usage is reflected. There isn't any way to run compaction manually in Kudu.
Review Link [1], Section "4.10 RowSet Compaction", which states: "We take this opportunity to remove deleted rows".
- Smarak
Created 01-27-2021 11:37 PM
Adding to what @smdas said:
This is one of the known Kudu limitations:
"There is no way to run compaction manually, but dropping the table will reclaim the space immediately."
You can verify the size from the CM graphs.
Reference: http://apache.github.io/kudu/docs/known_issues.html#_other_usage_limitations
Created 01-28-2021 10:14 AM
I followed the steps for the table chart and didn't find a big difference. There is one table with no data, but I can still see 400 MB of space used for that table. I deleted the table data using Impala.
Created 02-03-2021 03:32 AM
Ideally, if you have dropped the table, the data should be deleted immediately. The metrics in CM may take some time to reflect this, so we can verify from the backend whether the table is actually deleted.
Verify whether the table still exists in the Kudu FS. You can do this with the kudu ksck command and the -tables flag:
kudu cluster ksck <master_addresses> -tables=<tables>
Note: if the table was created through Impala, use "impala::db.tablename" as the table name.
If you still see the table in the ksck output, run the command below to delete the table from Kudu:
kudu table delete <master_addresses> <table_name>
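To avoid typos in the table name, the two commands above can be assembled with a small shell sketch. This is only an illustration: the helper names kudu_name_for_impala and ksck_cmd, the master addresses, and the table mydb.events are placeholders I made up, and the functions only print the command so you can review it before running it.

```shell
# Hypothetical helper: map an Impala "db.table" name to the Kudu-side name.
kudu_name_for_impala() {
  printf 'impala::%s\n' "$1"
}

# Hypothetical helper: print (not run) the ksck command for one table.
# $1 = comma-separated master addresses, $2 = Kudu table name.
ksck_cmd() {
  printf 'kudu cluster ksck %s -tables=%s\n' "$1" "$2"
}

# Placeholder values; replace with your own masters and table.
MASTERS="master1:7051,master2:7051,master3:7051"
TABLE="$(kudu_name_for_impala "mydb.events")"

ksck_cmd "$MASTERS" "$TABLE"
```

Copy the printed line into your shell once it looks right. The same table name can then be passed to kudu table delete if the table really should be removed.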
Created 02-03-2021 06:49 AM
I didn't delete the whole table; I deleted only some data. When I check the Kudu tablet server UI, I see a lot of tablets in the TABLET_DATA_TOMBSTONED state. The logs contain "Processing DeleteTablet for tablet" messages.
If the data is deleted, why is the tablet still being cached?
Created 02-03-2021 08:41 AM
The disk space occupied by deleted rows can only be reclaimed through compaction. Since you have deleted some data and the space has not been reclaimed, you are probably hitting this bug:
https://issues.apache.org/jira/browse/KUDU-1625
The JIRA is still unresolved. However, if the goal is to delete the data and reclaim disk space, you can drop the partition (if the table is range partitioned) to reclaim the space.
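As a rough sketch of the drop-partition approach, assuming the table is managed through Impala and range partitioned on a numeric column: the helper name drop_range_partition_sql, the table mydb.events, and the bounds 0 and 1000 are all hypothetical. The function only builds the ALTER TABLE statement so it can be checked first. Keep in mind that dropping a range partition removes every row in that range.

```shell
# Hypothetical helper: build an Impala statement that drops one range
# partition of a Kudu table.
# $1 = table name, $2 = inclusive lower bound, $3 = exclusive upper bound.
drop_range_partition_sql() {
  printf 'ALTER TABLE %s DROP RANGE PARTITION %s <= VALUES < %s;\n' "$1" "$2" "$3"
}

# Placeholder table and bounds; the bounds must match an existing partition.
drop_range_partition_sql "mydb.events" 0 1000

# After reviewing the statement, it could be passed to impala-shell, e.g.:
#   impala-shell -q "$(drop_range_partition_sql mydb.events 0 1000)"
```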
Tombstoned tablets have all their data removed from disk and don't consume significant resources. These tablets are necessary for the correct operation of Kudu.