I have a Google Kubernetes Engine cluster with a single node pool that has autoscaling enabled. Right now I’m sitting at 6 nodes with low utilization: requested CPU is at 10-30% of node capacity on each node, while requested RAM is at 71% on 3 nodes, 49% on 1 node, and over 90% on 2 more.
I would expect at least one of these nodes to be removed and its pods rescheduled elsewhere. Each of these nodes only runs a few (2-3) memory-intensive pods controlled by StatefulSets. I’ve even tried setting the “cluster-autoscaler.kubernetes.io/safe-to-evict: true” annotation, but it doesn’t seem to change anything.
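For reference, here is a minimal sketch of where I understand that annotation is supposed to live; all names, images, and resource numbers below are placeholders rather than my real config. The autoscaler reads the annotation from the Pods, so it goes into the StatefulSet’s pod template, with the value quoted as a string:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: memory-heavy-app                 # placeholder name
spec:
  serviceName: memory-heavy-app
  replicas: 3
  selector:
    matchLabels:
      app: memory-heavy-app
  template:
    metadata:
      labels:
        app: memory-heavy-app
      annotations:
        # The cluster autoscaler reads this annotation from the Pod itself,
        # so it belongs on the pod template, not the StatefulSet metadata,
        # and the value must be the string "true", not a YAML boolean.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: app
          image: example.com/memory-heavy-app:latest   # placeholder image
          resources:
            requests:
              memory: "8Gi"    # illustrative values only
              cpu: "500m"
```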
How can I troubleshoot this? Why doesn’t the autoscaler scale down for me?
Hi @konrad-garus,
Did you figure this out? Mostly, I’m curious about this stuff as I’m a k8s noob.
One possibility is that the cluster autoscaler cannot schedule all of the existing pods on fewer nodes. Going by your numbers, a node with 71% of its memory already requested has less than 30% headroom left, which may not be enough to absorb even one of the 2-3 memory-intensive pods from a drained node, so no node can actually be removed.
Do you need to keep all the pods? If not, and assuming you have multiple replicas running, you may also want to configure a Kubernetes Horizontal Pod Autoscaler for your workloads. In theory, that would let it retire some pod replicas and, in turn, let the cluster autoscaler retire some nodes.
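A minimal sketch of what that could look like, assuming an autoscaling/v2 HorizontalPodAutoscaler, a hypothetical StatefulSet named memory-heavy-app, and a cluster where resource metrics (metrics-server) are available:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: memory-heavy-app            # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet               # HPAs can target StatefulSets as well as Deployments
    name: memory-heavy-app
  minReplicas: 2                    # illustrative bounds only
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          # Scale in when average memory usage drops below ~75% of the pods' requests.
          averageUtilization: 75
```

With fewer replicas holding memory requests, the scheduler has more room to pack the remaining pods onto fewer nodes, which is what the cluster autoscaler needs before it will remove one.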
From “How cluster autoscaler works”:
If nodes are under-utilized, and all Pods could be scheduled even with fewer nodes in the node pool, Cluster autoscaler removes nodes, down to the minimum size of the node pool.