We have an HA cluster of 3 nodes built with MicroK8s 1.27 (installed via snap with --classic). All 3 nodes run the control plane (all 3 are master nodes), and all 3 run Ubuntu 22.04 LTS.
The cluster was formed by running microk8s add-node on one node, say Node A, and then running the join command it printed on the other 2 nodes, say Node B and Node C. In this setup, our observations are as follows:
- To test HA, whenever we restart Node B or Node C, it recovers within 5-10 SECONDS and causes no issues. That is, all our application services (K8s pods) running on the restarted node recover within 5-10 seconds.
- BUT, whenever we restart Node A (the node on which we ran the add-node command), it takes 5-10 MINUTES or more to recover, so HA effectively fails: all our application services (K8s pods) running on this restarted node take that long to recover.
- Why does recovery take only 5-10 seconds in case 1 (restarting Node B or Node C)?
- Why does it take 5-10 MINUTES in case 2 (restarting Node A)?
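For anyone trying to reproduce the timings, below is a minimal sketch of how the recovery time could be measured from one of the surviving nodes. It is not the exact procedure we used. The helper names wait_until and node_ready and the node name node-a are hypothetical, and the sketch assumes microk8s kubectl is usable on the node where it runs. Note that the node's Ready status is updated with some delay, so this gives a coarse figure only.

```shell
#!/bin/sh
# wait_until TIMEOUT CMD [ARGS...]
# Polls CMD once per second until it succeeds, then prints the elapsed seconds.
# Returns 1 if CMD has still not succeeded after TIMEOUT seconds.
wait_until() {
  _timeout="$1"
  shift
  _start=$(date +%s)
  until "$@" >/dev/null 2>&1; do
    _now=$(date +%s)
    if [ $((_now - _start)) -ge "$_timeout" ]; then
      echo "timed out after ${_timeout}s" >&2
      return 1
    fi
    sleep 1
  done
  echo $(( $(date +%s) - _start ))
}

# node_ready NODE
# Succeeds only when the node's Ready condition is reported as "True".
# Also fails if the API server cannot be reached at all.
node_ready() {
  _status=$(microk8s kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)
  [ "$_status" = "True" ]
}

# Example (hypothetical node name), run on Node B or Node C once Node A
# has gone down and is rebooting:
#   wait_until 900 node_ready node-a
```

The same wait_until helper can wrap any other check, for example a command that tests whether a particular pod is Ready, which is closer to what we actually care about.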
Are there any configuration parameters missing? Any urgent help would be greatly appreciated.