kudu compaction did not run
Labels: Apache Kudu
Created on ‎12-26-2018 11:48 PM - edited ‎09-16-2022 07:00 AM
Kudu 1.7.0 on CDH 5.15
3 master nodes: 4 cores, 32 GB RAM, Ubuntu 16.04
3 data nodes: 8 cores, 64 GB RAM, 1.8 TB SSD, Ubuntu 16.04
The table in question is project_construction_record: 62 columns, 170k records, no partitioning.
The table receives many CRUD operations every day.
I ran a simple query on it through Impala:
SELECT * FROM project_construction_record ORDER BY id LIMIT 1
It takes 7 seconds!
By checking the profile, I found this:
KUDU_SCAN_NODE (id=0) (6.06s)
- BytesRead: 0 byte
- CollectionItemsRead: 0
- InactiveTotalTime: 0 ns
- KuduRemoteScanTokens: 0
- NumScannerThreadsStarted: 1
- PeakMemoryUsage: 3.4 MB
- RowsRead: 177,007
- RowsReturned: 177,007
- RowsReturnedRate: 29188/s
- ScanRangesComplete: 1
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 6.09s
- MaterializeTupleTime: 6.06s
- ScannerThreadsSysTime: 48ms
- ScannerThreadsUserTime: 172ms
So I checked the scan details for this query and found this:
column | cells read | bytes read | blocks read |
id | 176.92k | 1.91M | 19.96k |
org_id | 176.92k | 1.91M | 19.96k |
work_date | 176.92k | 2.03M | 19.96k |
description | 176.92k | 1.21M | 19.96k |
user_name | 176.92k | 775.9K | 19.96k |
spot_name | 176.92k | 825.8K | 19.96k |
spot_start_pile | 176.92k | 778.7K | 19.96k |
spot_end_pile | 176.92k | 780.4K | 19.96k |
...... | ...... | ...... | ...... |
Far too many blocks are read: about 20k blocks per column for only about 177k rows.
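As a rough sanity check on the scan numbers above, dividing cells read by blocks read gives the average number of cells served per block. This is only a back-of-the-envelope sketch: the helper name `cells_per_block` is made up for illustration, and the inputs are the rounded figures from the scan table (176.92k cells, 19.96k blocks).

```python
def cells_per_block(cells_read: float, blocks_read: float) -> float:
    """Average number of cells obtained from each block that was read."""
    if blocks_read <= 0:
        raise ValueError("blocks_read must be positive")
    return cells_read / blocks_read


# Rounded values taken from the scan table for the `id` column.
ratio = cells_per_block(176_920, 19_960)
print(f"{ratio:.1f} cells per block read")  # prints "8.9 cells per block read"
```

Fewer than ten cells per block suggests the data is spread across a very large number of tiny blocks, which fits the slow scan even though the total bytes read are small.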
Then I ran the kudu fs list command and got a 70 MB report. Here is the end of it:
0b6ac30b449043a68905e02b797144fc | 25024 | 40310988 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310989 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310990 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310991 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310992 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310993 | column
0b6ac30b449043a68905e02b797144fc | 25024 | 40310996 | undo
0b6ac30b449043a68905e02b797144fc | 25024 | 40310994 | bloom
0b6ac30b449043a68905e02b797144fc | 25024 | 40310995 | adhoc-index
There are 25024 rowsets and more than 1 million blocks in the tablet.
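For a report this large, counting rowsets and blocks by hand is impractical. Below is a minimal sketch of how the listing could be summarized. It assumes every data row has the four pipe-separated fields seen in the excerpt above (tablet id, rowset id, block id, block kind); the function name `summarize_fs_list` is hypothetical, and the real output may carry headers or other columns depending on the options used.

```python
from collections import Counter


def summarize_fs_list(lines):
    """Count distinct (tablet, rowset) pairs and blocks per kind.

    Assumes rows shaped like: tablet_id | rowset_id | block_id | kind.
    Lines that do not match that shape (headers, separators) are skipped.
    """
    rowsets = set()
    kinds = Counter()
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 4:
            continue
        tablet_id, rowset_id, _block_id, kind = parts
        if not rowset_id.isdigit():
            continue  # header or separator row
        rowsets.add((tablet_id, int(rowset_id)))
        kinds[kind] += 1
    return len(rowsets), kinds


sample = [
    "0b6ac30b449043a68905e02b797144fc | 25024 | 40310988 | column",
    "0b6ac30b449043a68905e02b797144fc | 25024 | 40310996 | undo",
    "0b6ac30b449043a68905e02b797144fc | 25024 | 40310994 | bloom",
]
rowset_count, block_kinds = summarize_fs_list(sample)
print(rowset_count, sum(block_kinds.values()), dict(block_kinds))
```

To run it against the full report, pass an open file handle instead of the `sample` list, so the 70 MB file is streamed rather than loaded into memory.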
I left the maintenance and compaction flags at their defaults and only changed tablet_history_max_age_sec to one day:
--maintenance_manager_history_size=8
--maintenance_manager_num_threads=1
--maintenance_manager_polling_interval_ms=250
--budgeted_compaction_target_rowset_size=33554432
--compaction_approximation_ratio=1.0499999523162842
--compaction_minimum_improvement=0.0099999997764825821
--deltafile_default_block_size=32768
--deltafile_default_compression_codec=lz4
--default_composite_key_index_block_size_bytes=4096
--tablet_delta_store_major_compact_min_ratio=0.10000000149011612
--tablet_delta_store_minor_compact_max=1000
--mrs_use_codegen=true
--compaction_policy_dump_svgs_pattern=
--enable_undo_delta_block_gc=true
--fault_crash_before_flush_tablet_meta_after_compaction=0
--fault_crash_before_flush_tablet_meta_after_flush_mrs=0
--max_cell_size_bytes=65536
--max_encoded_key_size_bytes=16384
--tablet_bloom_block_size=4096
--tablet_bloom_target_fp_rate=9.9999997473787516e-05
--tablet_compaction_budget_mb=128
--tablet_history_max_age_sec=86400
This is a production environment, and many other tables have the same issue; performance keeps getting slower.
So my questions are:
Why does compaction not run? Is it a bug? And can I trigger a compaction manually?
Created ‎12-27-2018 12:54 PM
Hi huaj,
It looks like you are hitting KUDU-1400. Before the fix for that issue, Kudu compacts rowsets based on overlap only, not on other criteria such as on-disk size.
Unfortunately, there is no way to fix the small rowsets that have already been flushed. You can, however, rebuild the affected tables: create new tables from the existing ones and see if that helps. Before doing that, please check this doc to see which usage pattern caused you to hit this issue, and try to prevent it by following the recommendations.
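One way the rebuild could look from the Impala side is a CREATE TABLE ... AS SELECT followed by renames. The sketch below only generates the statements so they can be reviewed before running. The `_rebuilt` and `_old` suffixes, the function name, and the choice of `id` as primary key are assumptions for illustration, not details confirmed in this thread; check the syntax against your Impala version, make sure the key columns come first in the source table, and note that writes arriving during the copy would not be carried over.

```python
def rebuild_statements(table, key_columns, hash_partitions=None):
    """Build Impala SQL to copy a Kudu table into a fresh one and swap names.

    `table`, `key_columns` and `hash_partitions` are supplied by the caller;
    the `_rebuilt` / `_old` suffixes are hypothetical naming choices.
    """
    new_table = f"{table}_rebuilt"
    old_table = f"{table}_old"
    keys = ", ".join(key_columns)
    partition = ""
    if hash_partitions:
        partition = f" PARTITION BY HASH ({keys}) PARTITIONS {hash_partitions}"
    return [
        f"CREATE TABLE {new_table} PRIMARY KEY ({keys}){partition} "
        f"STORED AS KUDU AS SELECT * FROM {table}",
        f"ALTER TABLE {table} RENAME TO {old_table}",
        f"ALTER TABLE {new_table} RENAME TO {table}",
    ]


# Assumes `id` is the primary key of the table from the question.
for stmt in rebuild_statements("project_construction_record", ["id"]):
    print(stmt + ";")
```

Keeping the old table under a different name until the new one has been verified makes it possible to switch back if the rebuild does not help.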
FYI, the fix for KUDU-1400 should land in the next release, CDH 6.2.
Created ‎03-08-2019 06:10 PM
You set it to 24 hours, right? We also set it to 24 hours. At least we do not have hundreds of thousands of disk rowsets, but we only started last week.
Created ‎03-08-2019 08:12 PM
Yes.
If a table receives many operations (inserts/updates) within 24 hours, query performance drops significantly.
I guess the MRS (MemRowSet) is not columnar storage.
