Created 11-25-2023 07:19 AM
Hello,
I'm running a Kudu 1.17 cluster with 1 master and 1 tserver configured with the following flags:
--block_cache_capacity_mb=256
--use_hybrid_clock=false
--fs_data_dirs=/mnt/data0/data
--fs_metadata_dir=/mnt/data0/metadata
--fs_wal_dir=/mnt/data0/tserver
--log_container_metadata_runtime_compact=true
--rpc_dump_all_traces=true
--rpc_service_queue_length=200
--rpc_authentication=disabled
--rpc_encryption=disabled
--webserver_doc_root=/opt/kudu/www
--tablet_history_max_age_sec=300
--tserver_master_addrs=kudu-masters-0.kudu-masters-db-services-headless:7051
--heap_profile_path=/tmp/kudu.1
--unlock_experimental_flags=true
--unlock_unsafe_flags=true
--memory_limit_hard_bytes=2073741824
--stderrthreshold=0
I'm currently running 878 tablets on that single tserver.
For my workload, I'm executing around 10 concurrent scans via python kudu client on different tables, all having a partition by hash on the primary key. On the background, some write operations are performed from time to time, inserting just a little number of rows every time.
Frequently I incur into slow scans on one of the table, having a little number of rows, around 10 per partition.
By checking logs and RPC traces, it seems that a lot of time is spent in the operation tracked with the metric lbm_read_time_us:
{
"remote_ip": "10.42.1.32:44266",
"state": "OPEN",
"remote_user_credentials": "{username='root'}",
"calls_in_flight": [
{
"header": {
"call_id": 2,
"remote_method": {
"service_name": "kudu.tserver.TabletServerService",
"method_name": "Scan"
},
"timeout_millis": 29999,
"required_feature_flags": [
1
]
},
"micros_elapsed": 14871632
}
],
"socket_stats": {
"rtt": 1379,
"rttvar": 1442,
"snd_cwnd": 10,
"total_retrans": 0,
"pacing_rate": 20273724,
"max_pacing_rate": 18446744073709551615,
"bytes_acked": 760,
"bytes_received": 3507,
"segs_out": 12,
"segs_in": 16,
"send_queue_bytes": 0,
"receive_queue_bytes": 0,
"send_bytes_per_sec": 10137781
},
"transport_details": {
"tcp": {
"max_segment_size": 1398
}
}
}
I20231125 14:56:01.833339 109 rpcz_store.cc:269] Call kudu.tserver.TabletServerService.Scan from 10.42.1.32:44266 (request call id 2) took 16772ms. Trace:
I20231125 14:56:01.833528 109 rpcz_store.cc:270] 1125 14:55:45.061275 (+ 0us) service_pool.cc:169] Inserting onto call queue
1125 14:55:45.061332 (+ 57us) service_pool.cc:228] Handling call
1125 14:55:45.061432 (+ 100us) tablet_service.cc:2763] Created scanner 547f841e3efe42ff9c49896ceb28945a for tablet 4aa9943027b9479cb38ea1bfa543aa75
1125 14:55:45.061668 (+ 236us) tablet_service.cc:2903] Creating iterator
1125 14:55:45.061687 (+ 19us) tablet_service.cc:2928] Iterator created
1125 14:55:45.084323 (+ 22636us) tablet_service.cc:2950] Iterator init: OK
1125 14:55:45.084333 (+ 10us) tablet_service.cc:2993] has_more: true
1125 14:55:45.084346 (+ 13us) tablet_service.cc:3010] Continuing scan request
1125 14:55:45.084394 (+ 48us) tablet_service.cc:3073] Found scanner 547f841e3efe42ff9c49896ceb28945a for tablet 4aa9943027b9479cb38ea1bfa543aa75
1125 14:56:01.831663 (+16747269us) tablet_service.cc:3147] Deadline expired - responding early
1125 14:56:01.833323 (+ 1660us) inbound_call.cc:173] Queueing success response
Metrics: {"cfile_cache_hit":2,"cfile_cache_hit_bytes":78,"cfile_cache_miss":49,"cfile_cache_miss_bytes":1393,"cfile_init":3827,"delta_iterators_relevant":10572,"lbm_read_time_us":16386006,"lbm_reads_1-10_ms":2121,"lbm_reads_10-100_ms":322,"lbm_reads_gt_100_ms":3,"lbm_reads_lt_1ms":12911,"rowset_iterators":19,"scanner_bytes_read":665}
This is not consistent, by repeating the workload I see querying going from ms to s without a clear reason. It seems that Kudu is spending a lot of time trying to read data via the log block manager. But I don't know how to debug that, nor I know where to look to understand what is the problem.
Do you have any suggestions on why this is happening?
Thanks,
Dario
Created 11-15-2024 02:02 AM
HI Team, if you have 1 TS which is using for read and write then its obvisous to see the slowness from disk I/O as read/write both are taking the BW, You should be checking the kuud TS logs if you seeing any backpressure, soft memory exceed, any timeout, ancestral memory these all parameter will tell you if there is overload to TS and you can increase the parameter as needed, Please remmember running 1 TS with 848 tablets is fine but you should atleast have 3 TS so that one TS will not do all the work