Multi-data center cluster

Hello all

I have to come up with different concepts for achieving high availability for a Kubernetes cluster. The company I work at has two data centers (two different physical locations). One idea I had was to build a cluster consisting of nodes in both data centers, so if one location shuts down, the other one is still running. This wouldn’t be a multi-cluster setup but one cluster distributed over two data centers.

The information I gathered so far is:

  • each data center has to have at least one master node
  • network latency between the data centers has to be low (does anybody have any numbers or experience on this?)
  • I also read in an article that k3s is a good fit for this, since it can run on a SQL datastore instead of etcd, which makes it less sensitive to low-performance environments (the network latency issue). Can anybody confirm this?

I am aware that there are other (better) ways to achieve high availability. This idea doesn’t have to be fleshed out perfectly. This is more of a theoretical attempt to compare this idea to other solutions. I would like to know if it is possible and what challenges may come with it. Thanks!

In general we don’t suggest multi-region clusters for a number of the reasons you mention.

In practice, I know some people do it. For HA you generally want 3 control-plane nodes. With two sites, that means 2 in one region and 1 in the other, which puts you at risk of losing control-plane quorum if the larger region fails.
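
To make the quorum risk concrete, here is a rough Python sketch of the arithmetic. It assumes the usual majority rule for an etcd-style control plane (quorum = floor(n/2) + 1). The function names and the site labels "dc1" and "dc2" are made up for illustration and are not part of any Kubernetes tooling.

```python
def quorum(members: int) -> int:
    """Majority needed for a cluster of `members` voting nodes (assumed majority rule)."""
    return members // 2 + 1


def survives_site_loss(placement: dict[str, int]) -> dict[str, bool]:
    """For each site, report whether quorum still holds after that whole site is lost.

    `placement` maps a (hypothetical) site name to its number of control-plane nodes.
    """
    total = sum(placement.values())
    needed = quorum(total)
    return {site: (total - count) >= needed for site, count in placement.items()}


def two_site_splits_surviving_any_loss(max_members: int) -> list[tuple[int, int]]:
    """Search every split of 2..max_members nodes across two sites.

    Returns the splits (a, b) that keep quorum no matter which site fails.
    """
    found = []
    for total in range(2, max_members + 1):
        for a in range(1, total):
            b = total - a
            if all(survives_site_loss({"dc1": a, "dc2": b}).values()):
                found.append((a, b))
    return found


if __name__ == "__main__":
    # The 2 + 1 layout from the post above: losing the 2-node site breaks quorum.
    print(survives_site_loss({"dc1": 2, "dc2": 1}))
    # No split across exactly two sites survives the loss of either site.
    print(two_site_splits_surviving_any_loss(7))
```

Under that majority assumption, the search comes back empty: whichever way you split the nodes across exactly two sites, the site holding the majority is a single point of failure for the control plane.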

I’d say that you should think REALLY hard about failure couplings - what do you allow to fail together, and what needs to survive the failure of any one piece?

Think worst case: Friday afternoon, a meteor hits one of the DCs - how bad is your weekend?