RollingUpdate order in deployments and statefulsets

Hi everyone!

I have an application installed as a Deployment in k8s. This application is not just serving incoming requests; at startup it also establishes outbound connections to an external service on the Internet. The external service limits the number of connections to N, and an attempt to open connection N+1 leads to errors on the client side (rebalancing and re-establishing connections, something like that).

This usually happens during an update, so I added a RollingUpdate strategy to the Deployment to prevent scheduling of excess replicas:

  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

However, it didn't solve my problem because of the specific Deployment rollout sequence:

  • set the old pod to Terminating (graceful shutdown takes some time in my case)
  • at the same moment, schedule the new pod
  • keep the new pod actually running, but not Ready, while the old one is still terminating
  • set the new pod to Ready only after the old pod has been deleted

Since, for Kubernetes, pod readiness only means the ability to serve inbound connections, it does not prevent the pod from actually running and establishing outbound connections.

At the same time, a StatefulSet would wait for the old pod to be completely deleted before spawning a new one.

It's not like my application is actually stateful, but it seems the only way to prevent over-spawning is to use a StatefulSet.
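
For reference, a minimal sketch of what I mean (names and image are placeholders, not my real manifest):

  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: outbound-worker            # placeholder name
  spec:
    serviceName: outbound-worker     # StatefulSets expect an accompanying headless Service
    replicas: 3
    selector:
      matchLabels:
        app: outbound-worker
    updateStrategy:
      type: RollingUpdate            # pods are replaced one at a time; the old pod is fully
                                     # deleted before its successor is created
    template:
      metadata:
        labels:
          app: outbound-worker
      spec:
        containers:
          - name: app
            image: registry.example.com/outbound-worker:1.2.3   # placeholder image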

So I have some questions:

  • why does k8s assume that an application is not doing anything if it's not reachable from outside? Web servers are not the only type of application
  • is there a way to force a Deployment rollout to wait for the actual deletion of the old pod before scheduling the new one, or is this a fundamental difference between Deployment and StatefulSet?
  • should I consider a limited pool of outbound connections to be state?

Cluster information:

Kubernetes version: v1.30.9-gke.1127000
Cloud being used: Google

A Deployment really manages pods in terms of their readiness to serve. As soon as a pod is terminating, it is considered “done” from the Deployment’s point of view and a new one is brought up.

AFAIK there is no way to ask for complete removal of the old pod before the replacement is created. The request does not sound unreasonable, but such an option doesn’t exist.

If you have such a finite pool, you need to handle retries internally, which you probably do anyway. Realistically, your limited outbound connections effectively ARE state: just like a volume that cannot be double-mounted, you have to “hand off” these connection “slots”. You can do that either through some external mechanism (e.g. N leases) or simply by trying and retrying.
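
For example, the hand-off could be modelled with one Lease object per slot in the coordination.k8s.io API: a pod only opens its outbound connection once it has written its own name into holderIdentity, and renews the lease while it holds the slot. The object below is just an illustration; the acquire/renew logic has to live in your application or a sidecar:

  apiVersion: coordination.k8s.io/v1
  kind: Lease
  metadata:
    name: outbound-slot-0                # one Lease per allowed connection: slot-0 .. slot-(N-1)
    namespace: my-app                    # placeholder namespace
  spec:
    holderIdentity: my-app-7d9f4-abcde   # pod name of the current slot owner (placeholder)
    leaseDurationSeconds: 30             # slot counts as free if renewTime is not refreshed in time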

If you don’t want the rolling update to proceed until the connection is established, you might use a startupProbe that only succeeds once the connection is up, but be careful: a probe that depends on an external service can lead to flapping of your service and an overall worse experience.
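
A rough sketch of such a probe, assuming your application exposes some local endpoint (the /outbound-ready path here is hypothetical) that only returns success once the upstream connection is established:

  containers:
    - name: app
      image: registry.example.com/outbound-worker:1.2.3   # placeholder image
      startupProbe:
        httpGet:
          path: /outbound-ready    # hypothetical endpoint reporting the outbound connection state
          port: 8080
        periodSeconds: 5
        failureThreshold: 30       # allow up to ~150s for the connection to come up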
