I have a Google Kubernetes Engine cluster with a single node pool that has autoscaling enabled. Right now I'm sitting at 6 nodes with low utilization. CPU requests are at 10-30% of node capacity on every node. For RAM requests, 3 nodes are at 71%, 1 node is at 49%, and 2 more are over 90%.
I would expect one of these nodes to be removed and its pods migrated elsewhere. These nodes only run a few (2-3) memory-intensive pods each, managed by StatefulSets. I've even tried setting the `cluster-autoscaler.kubernetes.io/safe-to-evict: "true"` annotation, but it does not seem to change anything.
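For reference, this is roughly how I'm applying the annotation (resource names and image are placeholders for my actual workload):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: memory-intensive-app   # placeholder name
spec:
  serviceName: memory-intensive-app
  replicas: 3
  selector:
    matchLabels:
      app: memory-intensive-app
  template:
    metadata:
      labels:
        app: memory-intensive-app
      annotations:
        # Set on the pod template (so it lands on the pods themselves,
        # not on the StatefulSet object); the value is the string "true".
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest   # placeholder image
```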
How can I troubleshoot this? Why doesn’t the autoscaler scale down for me?