0 downtime with 1 replica

Hi everyone,

I've got a process to update the cluster and its nodes: I do a cordon and then a drain. Since some of my deployments use 1 replica, I'd like to achieve zero-downtime upgrades for these deployments.
I know I can use 2 replicas with a PDB. I tested it and it works, but I'd like to try a method that gives zero downtime with 1 replica. I saw it was possible to use maxSurge to do that, but can you explain how? Thanks!

Hi @sashalarsoo

You're right: achieving zero downtime with only 1 replica is inherently challenging, because kubectl drain evicts the only running pod, leading to temporary unavailability.

That said, here are a few insights and approaches:

  1. Understanding the limitation with 1 replica

With a single replica, a node drain evicts the only pod, and its replacement is created only after the eviction, so there is a window where no pod is running. Downtime is almost unavoidable unless the new pod is created and ready before the old one is terminated.

  2. Why maxSurge can help (only in Deployments)

When using a Deployment with the RollingUpdate strategy, you can configure:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0

This allows creating a new pod before deleting the old one, even when you only have replicas: 1.

However, this only works if the node is not drained before the new pod is up. Otherwise the old pod is evicted anyway and you lose availability. So this works best for application updates, not for node draining.
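If the Deployment already exists, the same strategy can be applied with a patch instead of editing the manifest. This is only a sketch: the function name `set_surge_strategy`, the `KUBECTL` override variable, and the `my-app` / `default` names in the usage line are placeholders I made up, not anything from your setup.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands
# without touching a cluster. This variable is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"

# set_surge_strategy DEPLOYMENT [NAMESPACE]
# Patches the rollout strategy, then prints it back for a quick check.
set_surge_strategy() {
  local deploy="${1:?deployment name required}"
  local ns="${2:-default}"

  # Merge-patch only the strategy field; the pod template is untouched,
  # so this does not trigger a rollout by itself.
  $KUBECTL patch deployment "$deploy" --namespace "$ns" --type merge \
    -p '{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}' &&
  # Read the strategy back to confirm the values were applied.
  $KUBECTL get deployment "$deploy" --namespace "$ns" \
    -o jsonpath='{.spec.strategy.rollingUpdate}'
}
```

Usage would be something like `set_surge_strategy my-app default`. Keep in mind that maxUnavailable: 0 only protects you if the pod has a readiness probe that reflects when it can really serve traffic.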

  3. How to leverage maxSurge effectively
  • Ensure you use a Deployment, not a StatefulSet or DaemonSet.
  • Set maxUnavailable: 0 and maxSurge: 1.
  • Do a rolling update of the Deployment, not a node drain, to trigger the zero-downtime rollout.
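One way to combine the points above with your existing cordon step, as a rough sketch: cordon the node first so nothing new can be scheduled there, trigger a rolling restart so the surge pod comes up on another node, wait for the rollout to finish, and only then drain. The function name `surge_off_node`, the `KUBECTL` override, and the example names are hypothetical, and this assumes the Deployment already has maxSurge: 1 and maxUnavailable: 0.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
# This variable is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"

# surge_off_node NODE DEPLOYMENT [NAMESPACE]
surge_off_node() {
  local node="${1:?node name required}"
  local deploy="${2:?deployment name required}"
  local ns="${3:-default}"

  # 1. Mark the node unschedulable so the new pod cannot land on it.
  $KUBECTL cordon "$node" &&
  # 2. Trigger a rolling update; with maxSurge=1 / maxUnavailable=0 the
  #    new pod is created before the old one is terminated.
  $KUBECTL rollout restart "deployment/$deploy" --namespace "$ns" &&
  # 3. Block until the rollout has completed (or the timeout is hit).
  $KUBECTL rollout status "deployment/$deploy" --namespace "$ns" --timeout=5m &&
  # 4. Drain the node; the Deployment's pod should already be elsewhere.
  $KUBECTL drain "$node" --ignore-daemonsets --delete-emptydir-data
}
```

For example: `surge_off_node node-1 my-app default`. Each step only runs if the previous one succeeded, so a stuck rollout stops the script before the drain.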
  4. Alternative approach for node maintenance

If you really need to drain nodes, but still avoid downtime with 1 replica, consider:

  • Temporarily scale to 2 replicas, do the node drain, and scale back to 1 after.
  • Or upgrade the nodes in place, without cordon/drain (only possible in some environments).
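The temporary scale-up option could look roughly like this. It is a sketch under assumptions: `drain_with_temp_replica`, the `KUBECTL` override, and the example names are made up for illustration. I cordon before scaling so the extra replica cannot be scheduled onto the node that is about to be drained.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
# This variable is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"

# drain_with_temp_replica NODE DEPLOYMENT [NAMESPACE]
drain_with_temp_replica() {
  local node="${1:?node name required}"
  local deploy="${2:?deployment name required}"
  local ns="${3:-default}"

  # Cordon first so the second replica is placed on a different node.
  $KUBECTL cordon "$node" &&
  # Temporarily run two replicas.
  $KUBECTL scale "deployment/$deploy" --replicas=2 --namespace "$ns" &&
  # Wait until both replicas are available before evicting anything.
  $KUBECTL rollout status "deployment/$deploy" --namespace "$ns" --timeout=5m &&
  # Drain the node; the replica on the other node keeps serving.
  $KUBECTL drain "$node" --ignore-daemonsets --delete-emptydir-data &&
  # Return to the original single replica.
  $KUBECTL scale "deployment/$deploy" --replicas=1 --namespace "$ns"
}
```

For example: `drain_with_temp_replica node-1 my-app default`. If a step fails, the later ones are skipped, so the Deployment may be left at 2 replicas and need a manual scale back.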

Let me know what controller type you’re using (Deployment, StatefulSet, etc.) and whether this is during app update or node upgrade, and I can give more tailored advice.

Hope this helps!