Kubernetes version: v1.35.0
Host OS: Red Hat Enterprise Linux 9.6
CNI and version: Calico
CRI and version: containerd://2.2.1
==========
I have a Kubernetes cluster stretched across two data centers.
- In DC1 there are 4 worker nodes and 2 master nodes.
- In DC2 there are 4 worker nodes and 1 master node.
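This 2+1 split of the masters is what makes the failure mode below possible. A minimal sketch of the Raft/etcd quorum arithmetic for this layout (assuming a standard stacked-etcd control plane, where quorum for an n-member cluster is floor(n/2) + 1):

```python
# Quorum arithmetic for a 3-member etcd cluster split 2 (DC1) + 1 (DC2).
def quorum(members: int) -> int:
    """Minimum healthy members needed for etcd to commit writes."""
    return members // 2 + 1

total_masters = 3                 # 2 in DC1 + 1 in DC2
needed = quorum(total_masters)    # -> 2
survivors_after_dc1_loss = 1      # only the DC2 master remains

print(f"quorum needed: {needed}")
print(f"survivors after DC1 loss: {survivors_after_dc1_loss}")
print(f"cluster can still write: {survivors_after_dc1_loss >= needed}")  # False
```

So losing DC1 leaves 1 of 3 members, which is below quorum: the surviving apiserver can no longer persist any state changes.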
We performed an HA test. During a simulated failure of DC1, the external load balancer correctly redirected all traffic to DC2.
However, we observed that roughly 50% of the traffic handled by ClusterIP services on the DC2 worker nodes was still being forwarded to pods on nodes in DC1, which were already unavailable.
As a result, the application became unstable because roughly half of the backend traffic was impacted.
This seems to happen because the Kubernetes control plane loses etcd quorum (only one of the three master nodes remains in DC2), which prevents any updates to the cluster state: the failed DC1 nodes are never marked NotReady, their pods are never removed from the Service endpoints, and kube-proxy on the DC2 nodes therefore keeps routing traffic to endpoints that belong to DC1.
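A toy model of that mechanism, assuming kube-proxy's usual behavior of programming dataplane rules from the last EndpointSlice state received over its apiserver watch and simply keeping those rules when the watch breaks (the class and endpoint names here are hypothetical, for illustration only):

```python
class StaleProxySketch:
    """Toy model: a proxy keeps routing to last-known endpoints
    once its apiserver watch stops delivering updates."""

    def __init__(self):
        self.endpoints = []        # last state received from the watch
        self.watch_healthy = True

    def on_watch_event(self, endpoints):
        # Updates only arrive while the watch is healthy.
        if self.watch_healthy:
            self.endpoints = list(endpoints)

    def apiserver_unreachable(self):
        # Quorum lost: the watch breaks and no further updates arrive.
        self.watch_healthy = False

    def route(self):
        # Rules stay programmed from the stale cache.
        return self.endpoints


proxy = StaleProxySketch()
proxy.on_watch_event(["dc1-pod-a", "dc1-pod-b", "dc2-pod-a", "dc2-pod-b"])
proxy.apiserver_unreachable()  # DC1 fails and quorum is lost at the same time
print(proxy.route())           # still lists the dead DC1 pods
```

With half of the cached endpoints living in DC1, round-robin-style selection over this stale list matches the observed ~50% of failed backend requests.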
How can this behavior be eliminated? Specifically, how can we ensure that traffic is not routed to unavailable nodes/pods while the kube-apiserver is unreachable due to loss of quorum (only one master remaining in DC2)?