Created on 01-07-2019 10:44 PM - edited 09-16-2022 07:02 AM
We have a client who wants to setup a small separate special purpose cluster. It will consist of 5 nodes and they want to run Kudu on it. If we configure with 3 Master nodes and 3 Tablet Servers, at least one of the nodes will have both Master roles as well as Tablet Server role. In any case, in this scenario, if hypothetically the node with both roles fail, and the replication is set to 3, neither the master tablet nor any of the table tablets will be able to satisfy the replication factor since there are now only 2 master roles and 2 table servers available. Since this is still within the (n-1)/2 failures, will the service still run? After 5 minutes, and when the nodes are determined to be permanent failures, a new follower needs to be created but since there are only 2 tablet servers still functioning, what will happen? Will the system limp along, still functioning or will it fail?
Thanks in advance for any help.
Created 01-08-2019 10:46 AM
The master role generally takes few resources, and it isn't uncommon to colocate a Tablet Server and Master. Additionally, it's is strongly encouraged to have at least four Tablet Servers for a minimal deployment for higher availability in the case of failures (see the note about KUDU-1097 here). That said, with five nodes, you could have three Masters and five Tablet Servers. Even still, five Tablet Servers isn't huge, so please take a look at the Scaling Limits and Scaling Guide to ensure you sufficiently provision your cluster.
If a node with both roles fails, the following will happen:
On the Master side, if the failed Master was a leader Master, a new leader will be elected between the remaining two Masters and business will continue as usual. No automatic recovery will happen to bring up a new Master, and these steps should be followed to recover the failed Master when convenient.
On the Tablet Server side, the tablet replicas that existed on the Tablet Server role will automatically be re-replicated to the remaining four Tablet Servers. If you went with three Tablet Servers, this would not happen, since the remaining two Tablet Servers would already have replicas of every tablet, and the cluster would be stuck with every tablet having two copies instead of three. The service would still function, but another failure would render every tablet unavailable.
Created 01-09-2019 07:39 PM
In terms of resources, five masters wouldn't strain the cluster much. The big change is that the state that lives on the master (e.g. catalog metadata, etc.) would need to be replicated with a replication factor of 5 in mind (i.e. at least 3 copies to be considered "written"). While this is possible, the recommended configuration is 3. It is the most well-tested and commonly-used.
Created 01-08-2019 10:46 AM
The master role generally takes few resources, and it isn't uncommon to colocate a Tablet Server and Master. Additionally, it's is strongly encouraged to have at least four Tablet Servers for a minimal deployment for higher availability in the case of failures (see the note about KUDU-1097 here). That said, with five nodes, you could have three Masters and five Tablet Servers. Even still, five Tablet Servers isn't huge, so please take a look at the Scaling Limits and Scaling Guide to ensure you sufficiently provision your cluster.
If a node with both roles fails, the following will happen:
On the Master side, if the failed Master was a leader Master, a new leader will be elected between the remaining two Masters and business will continue as usual. No automatic recovery will happen to bring up a new Master, and these steps should be followed to recover the failed Master when convenient.
On the Tablet Server side, the tablet replicas that existed on the Tablet Server role will automatically be re-replicated to the remaining four Tablet Servers. If you went with three Tablet Servers, this would not happen, since the remaining two Tablet Servers would already have replicas of every tablet, and the cluster would be stuck with every tablet having two copies instead of three. The service would still function, but another failure would render every tablet unavailable.
Created 01-08-2019 04:41 PM
Created 01-08-2019 04:48 PM
That's correct, it is a manual process to get back up to three Masters, though the remaining two-Master deployment will still be useable until then. KUDU-2181 would improve this, but no one is working on it right now, AFAIK.
Created 01-08-2019 08:55 PM
In such a small cluster I'd definitely consider doubling up masters and tservers on all of the master nodes (ie 3 masters and 5 tservers). The master is pretty light weight and can be colocated with tservers for such a small workload. This way you'll get better fault tolerance and also better performance vs using 2/5 of the nodes mostly unutilized.
-Todd
Created 01-09-2019 04:38 PM
Created 01-09-2019 07:39 PM
In terms of resources, five masters wouldn't strain the cluster much. The big change is that the state that lives on the master (e.g. catalog metadata, etc.) would need to be replicated with a replication factor of 5 in mind (i.e. at least 3 copies to be considered "written"). While this is possible, the recommended configuration is 3. It is the most well-tested and commonly-used.