Spread deployments across AZs (not replicas)

Hello guys,

Some context: I am working with EKS along with Karpenter. Cool setup, and it works fine most of the time.

In this use case, I have X deployments of an app, one for each team, so each deployment is customized per team.

Let's say we have 3 teams. I will have 3 deployments with 1 replica each (no need for more, yet). I need to balance these deployments across AZs.

It would work if my deployment contained more than 1 replica (let's say 3); kube-scheduler would place one replica in each AZ.

But in this case, each deployment only has 1 replica, and for now I have 4 deployments. One is in AZ-1 and the rest are in AZ-2.

Is topologySpreadConstraints meant to spread only the pods that belong to a single deployment? Or all the pods that match the labels in the labelSelector field, regardless of whether they are managed by other deployments?
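For what it's worth, topologySpreadConstraints count every existing pod that matches the labelSelector, regardless of which Deployment owns it, so one common label shared by all the team deployments should spread them as a group. A minimal sketch of one team's Deployment, assuming a hypothetical shared label `team-suite: myapp` added to every team's pod template:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-team-a        # one Deployment like this per team
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp-team-a
  template:
    metadata:
      labels:
        app: myapp-team-a
        team-suite: myapp   # shared label across ALL team Deployments
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              team-suite: myapp   # counts pods from every team Deployment
      containers:
        - name: app
          image: myapp:latest    # placeholder image
```

With `whenUnsatisfiable: DoNotSchedule`, a pod that would break the skew stays Pending, which is what prompts Karpenter to provision a node in an under-represented zone instead of packing everything into one subnet.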

The main reason for this question is that sometimes Karpenter (or kube-scheduler) tries to run everything in one subnet, leading to IP exhaustion. We cannot create new subnets.

I have been working on this topic for the last week, playing around with topologySpreadConstraints and affinity/anti-affinity rules, but, maybe due to tunnel vision, I cannot make it work.

Cluster information:

Kubernetes version: 1.30
Cloud being used: AWS
Installation method: Managed EKS
Host OS: Linux
CNI and version: v1.18.5
CRI and version: containerd://1.7.11

Interesting case. On Google Cloud this is handled natively with a cluster-level setting that distributes workloads across zones.

I believe your best bet would be intercepting the K8s API calls with a mutating webhook and adding some logic at admission time: look at the number of free IPs in each zone, then add an annotation or node selector to send that specific single-pod request to a node whose subnet has more free IP addresses. It gets a little complicated depending on how familiar you are with intercepting those calls at the API layer.
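To make the idea concrete, here is a hedged sketch of the mutating-webhook logic, not a production implementation. Everything here is an assumption for illustration: the `ZONE_FREE_IPS` table (which you would populate from the EC2 API), the `pick_zone` helper, and serving this behind a `MutatingWebhookConfiguration` with TLS, which is omitted.

```python
import base64
import json

# Assumption: free-IP counts per zone, e.g. polled periodically from
# the EC2 DescribeSubnets API. Hypothetical values for illustration.
ZONE_FREE_IPS = {"eu-west-1a": 120, "eu-west-1b": 40, "eu-west-1c": 15}

def pick_zone() -> str:
    # Choose the zone whose subnet currently has the most free IPs.
    return max(ZONE_FREE_IPS, key=ZONE_FREE_IPS.get)

def mutate(admission_review: dict) -> dict:
    """Build an AdmissionReview response whose JSONPatch pins the
    incoming pod to the least-crowded zone via a nodeSelector."""
    uid = admission_review["request"]["uid"]
    patch = [{
        "op": "add",
        "path": "/spec/nodeSelector",
        "value": {"topology.kubernetes.io/zone": pick_zone()},
    }]
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            # The API server expects the patch base64-encoded.
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

The trade-off of this approach versus plain topologySpreadConstraints is that you are encoding IP-capacity awareness the scheduler does not have, at the cost of running and securing an extra admission component.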
