Our team has noticed that when we restart the cluster, replication and availability of the Kudu tables takes a couple of hours before we can actually start the jobs that load or process Kudu tables.
Thank you for the report.
Could you be more specific and provide some information on the version of Kudu (or CDH) you are using? Also, how many replicas per tablet server and how much data per tablet server does the cluster have? In Kudu 1.6 and prior versions, the re-replication process could take a very long time in some scenarios involving a restart of a tablet server. That has improved dramatically since Kudu 1.7, when a more robust replica management scheme was introduced.
The CDH version in the current environment is 5.11.2, with Kudu 1.3.0, on a 20-node cluster.
We have 18 registered tablet servers and 3 masters.
Each tablet server has a 16 GB hard memory limit for Kudu
and a Kudu tablet server block cache capacity of 1 GB.
The total tablet size on disk across Kudu replicas is around 2 TB.
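For reference, those two limits map to tablet server gflags that Cloudera Manager surfaces as configuration settings. A sketch of the corresponding flags, assuming the flag names as they exist in Kudu 1.x (verify against your build before relying on them):

```
# Tablet server gflag fragment matching the reported limits:
--memory_limit_hard_bytes=17179869184   # 16 GiB hard memory limit
--block_cache_capacity_mb=1024          # 1 GB block cache capacity
```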
I would strongly recommend upgrading from your older version of Kudu because there have been many improvements to address the issues you are describing.
See the release notes for the Kudu releases after 1.3.0; many of these fixes will help you:
- The default size for Write Ahead Log (WAL) segments has been reduced from 64MB to 8MB. Additionally, in the case that all replicas of a tablet are fully up to date and data has been flushed from memory, servers will now retain only a single WAL segment rather than two. These changes are expected to reduce the average consumption of disk space on the configured WAL disk by 16x, as well as improve the startup speed of tablet servers by reducing the number and size of WAL segments that need to be re-read.
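For illustration, the WAL behavior described above is controlled by tablet server gflags. A hypothetical fragment approximating the new defaults, assuming the `--log_segment_size_mb` and `--log_min_segments_to_retain` flag names (check your version's flag reference before changing these on an older release):

```
# Tablet server gflag fragment sketching the newer WAL defaults:
--log_segment_size_mb=8          # down from the old 64 MB default
--log_min_segments_to_retain=1   # keep one segment once data is flushed
```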
- The Maintenance Manager has been improved to make better use of the configured maintenance threads. Previously, maintenance work would only be scheduled a maximum of 4 times per second, but now maintenance work is scheduled immediately whenever any configured thread is available. This can improve the throughput of write-heavy workloads.
- KUDU-2020 Fixed an issue where re-replication after a failure would proceed significantly slower than expected. This bug caused many tablets to be unnecessarily copied multiple times before successfully being considered re-replicated, resulting in significantly more network and IO bandwidth usage than expected. Mean time to recovery on clusters with large amounts of data is improved by up to 10x by this fix.
- Tablet server startup time has been improved significantly on servers containing large numbers of blocks.
- The strategy Kudu uses for automatically healing tablets that have lost a replica due to server or disk failures has been improved. The new re-replication strategy, or replica management scheme, first adds a replacement tablet replica before evicting the failed one. With the previous replica management scheme, the system first evicted the failed replica and then added a replacement. The new scheme allows for much faster recovery of tablets in scenarios where one tablet server goes down and then comes back shortly afterwards (within 5 minutes or so). It also provides substantially better overall stability on clusters with frequent server failures (see KUDU-1097).
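As an aside, the new scheme is gated by a gflag in Kudu 1.7+; this is a sketch assuming the `--raft_prepare_replacement_before_eviction` flag name, which is set on both masters and tablet servers:

```
# Selects the replica management scheme (on by default in Kudu 1.7+):
# true  = add replacement replica first, then evict the failed one
# false = legacy behavior: evict first, then add a replacement
--raft_prepare_replacement_before_eviction=true
```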
- A manual data rebalancer has been introduced in the `kudu` CLI tool. The rebalancer can be used to redistribute table replicas among tablet servers and is run via the `kudu cluster rebalance` sub-command. Using the new tool, it's possible to rebalance Kudu clusters running version 1.4.0 and newer.
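A sketch of how the rebalancer is invoked, using placeholder master addresses (substitute your own masters, and run with a 1.8+ `kudu` binary):

```shell
# Dry run: report the cluster's balance state without moving any replicas
# (--report_only is a flag of the rebalancer tool).
kudu cluster rebalance --report_only master-1:7051,master-2:7051,master-3:7051

# Actually redistribute table replicas among the tablet servers.
kudu cluster rebalance master-1:7051,master-2:7051,master-3:7051
```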
(note: CDH 5.16.1 doesn't include everything new from Kudu 1.8.0, only a few things like the rebalancer, but CDH 5.15.1 includes everything from Kudu 1.7.0 and earlier)
If you can, upgrade to CDH 5.15.1 or CDH 5.16.1.
There are also many other improvements unrelated to startup time that I have not called out here, such as a greatly reduced thread count, various optimizations, many other bug fixes, and lots of improvements to observability and operability.