Preventing HPA Auto-Scaling during Kubernetes deployments when using custom metrics

Cluster information:

Kubernetes version: 1.30.9-gke.1127000
Cloud being used: Google Cloud
Installation method: Managed GKE cluster
Host OS: Container-Optimized OS from Google (Kernel 6.1.85+)
CNI and version: Google’s default solution
CRI and version: default containerd

We have an app that automatically sets up Kubernetes HorizontalPodAutoscalers (HPAs) for our customers.

We now use Prometheus metrics as HPA targets. These metrics are exported from our Java applications by the JMX exporter; we mostly use JVM internal memory usage metrics, such as prometheus.googleapis.com/jvm_memory_used_bytes/gauge.
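
For reference, this is roughly the kind of HPA object we generate. A minimal sketch, assuming the Custom Metrics Stackdriver Adapter (which exposes Cloud Monitoring metrics to the HPA as External metrics, with the / characters in the metric name replaced by |); all names and bounds below are placeholders:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          # "/" in the Cloud Monitoring metric name becomes "|" in the HPA spec
          name: prometheus.googleapis.com|jvm_memory_used_bytes|gauge
        target:
          type: AverageValue
          averageValue: "9172000000"   # the desiredMetricValue used in the example below
```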

We’re experiencing undesired auto-scaling events triggered by the Horizontal Pod Autoscaler (HPA) when we update a Deployment’s image.

The Problem

We believe this is what is happening:

Using the RollingUpdate deployment strategy, Kubernetes creates a new pod before terminating an old one. This temporarily increases currentReplicas, affecting the HPA formula:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]

It takes some time for the new pod to start exporting its own metrics, though. The ratio currentMetricValue / desiredMetricValue therefore stays the same while currentReplicas has grown, so the desired number of replicas increases.
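
For context, the surge pod comes from the rolling update parameters: with the defaults (maxSurge of 25%, which rounds up to one extra pod for a small Deployment), a new pod is created before any old pod terminates. A minimal sketch with hypothetical names and image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1             # one extra pod during the rollout -> currentReplicas = 3
      maxUnavailable: 0       # old pods keep serving until the new pod is ready
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: registry.example.com/my-app:1.2.3   # hypothetical image
```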

Example

Suppose I have two pods that are serving the current demand well. I update the image version in their Deployment, which starts a new pod with the new image. The values in the formula are then:

  • currentReplicas = 3 (includes the new pod)
  • currentMetricValue = 6430859672 (same as before, since the new pod’s metrics have not been scraped into Prometheus yet)
  • desiredMetricValue = 9172000000
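
Plugging these into the formula:

desiredReplicas = ceil[3 * (6430859672 / 9172000000)] = ceil[2.10] = 3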

But I do not need a third pod for the current demand, so this desiredReplicas = 3 is an undesired scale-up.

Considered solutions

We tried or considered these solutions:

  • Switching to the Recreate strategy avoids the issue, but we cannot afford the associated downtime.
  • The --horizontal-pod-autoscaler-initial-readiness-delay flag didn’t help; from the Kubernetes documentation, we understood it only affects CPU resource metric collection.
  • We are considering automatically disabling auto-scaling during deployments (see the sketch below), but it is a last resort since it adds complexity.
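
For completeness, this is roughly what that last resort would look like. A sketch using hypothetical my-app / my-app-hpa names: pin the HPA to the current replica count before the rollout, then restore the original bounds afterwards:

```bash
# Read the current replica count (hypothetical Deployment name)
CURRENT=$(kubectl get deployment my-app -o jsonpath='{.spec.replicas}')

# Pin min and max so the HPA cannot scale during the rollout
kubectl patch hpa my-app-hpa --type merge \
  -p "{\"spec\":{\"minReplicas\":${CURRENT},\"maxReplicas\":${CURRENT}}}"

# Roll out the new image and wait for it to finish
kubectl set image deployment/my-app app=registry.example.com/my-app:new-tag
kubectl rollout status deployment/my-app

# Restore the original autoscaling bounds
kubectl patch hpa my-app-hpa --type merge \
  -p '{"spec":{"minReplicas":2,"maxReplicas":10}}'
```

The obvious drawback is that genuine load changes during the rollout would be ignored, which is part of why we see this as a last resort.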

Is there a way, perhaps some option on the HorizontalPodAutoscaler object, to prevent this from happening?