HA cluster created using MicroK8s has issues recovering in case of failure

ganeshgudghe · October 26, 2023, 6:12am

We have an HA cluster of 3 nodes created using MicroK8s (1.27, installed with --classic). All 3 nodes run the control plane (all 3 are master nodes), and all 3 run Ubuntu 22.04 LTS.

The cluster was formed by running microk8s add-node on one node, say Node A. That gave us the command to join the cluster, which we then ran on the other 2 nodes, say Node B and Node C. In this scenario, our observations are as follows:

1. To test HA, whenever we restart Node B or Node C, it recovers within 5-10 SECONDS with no issues. That is, all our application services (K8s pods) running on the restarted node recover within 5-10 seconds.
2. BUT whenever we restart Node A (the node on which we ran the add-node command), it takes 5-10 MINUTES or more to recover, causing HA to fail, because all our application services (K8s pods) running on the restarted node take that long to recover.
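
To put numbers on the two cases above, one option is to poll the restarted node's Ready condition from a node that stays up and time the gap. The following is only a minimal sketch, assuming `microk8s kubectl` is usable on the machine running it. The function names (`node_is_ready`, `outage_seconds`, `watch`) are hypothetical helpers, not part of MicroK8s, and node readiness is only a proxy for how quickly the pods themselves come back.

```python
import subprocess
import time

# JSONPath that extracts the status of the node's "Ready" condition.
READY_JSONPATH = '{.status.conditions[?(@.type=="Ready")].status}'


def node_is_ready(node_name):
    """Return True if kubectl reports the node's Ready condition as "True".

    Any failure (API server unreachable, timeout, unknown node) counts as
    not ready. Assumes `microk8s kubectl` works on this machine.
    """
    try:
        result = subprocess.run(
            ["microk8s", "kubectl", "get", "node", node_name,
             "-o", f"jsonpath={READY_JSONPATH}"],
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0 and result.stdout.strip() == "True"


def outage_seconds(samples):
    """Return the length of the first completed outage in `samples`.

    `samples` is a time-ordered list of (timestamp, ready) pairs. The outage
    runs from the first not-ready sample to the next ready sample. Returns
    None if no outage started or the outage never ended.
    """
    down_at = None
    for timestamp, ready in samples:
        if not ready and down_at is None:
            down_at = timestamp
        elif ready and down_at is not None:
            return timestamp - down_at
    return None


def watch(node_name, interval=1.0, duration=900.0):
    """Poll the node every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), node_is_ready(node_name)))
        time.sleep(interval)
    return samples
```

For example, running `outage_seconds(watch("node-a"))` on Node B while Node A reboots (where `node-a` is a placeholder for the real node name) would give a rough figure to compare against the same test for Node B or Node C.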

Questions:

Why does it take only 5-10 seconds in case 1?
Why does it take 5-10 MINUTES in case 2?

Are there any configuration parameters missing? Any urgent help would be greatly appreciated.
