Note: Since the resource usages of all the containers are summed up the total pod utilization may not accurately represent the individual container resource usage. This could lead to situations where a single container might be running with high usage and the HPA will not scale out because the overall pod usage is still within acceptable limits.
Quote from Horizontal Pod Autoscaling | Kubernetes (2023-11-29)
The documentation warns about this problem, but no solution is given.
My application has high response times and returns some 503 status codes because of high CPU usage on one pod.
Is there a way to make the HorizontalPodAutoscaler scale on the highest CPU usage among the pods, rather than on the average?
Should Kubernetes integrate a feature to solve this, or is this just a load-balancing problem in Istio?
Of course, an easy solution would be to lower averageUtilization, but this makes the service inefficient.
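To make the problem concrete, here is a minimal sketch of the replica calculation as the HPA documentation describes it (desired replicas = ceil(current replicas × current average / target)). It is a simplification, not the controller's actual code: it ignores pod readiness, missing metrics, and stabilization windows, and it assumes the default tolerance of 0.1. The function name and the utilization numbers are hypothetical.

```python
import math


def desired_replicas(pod_utilizations, target_utilization, tolerance=0.1):
    """Simplified HPA replica calculation for an averageUtilization target.

    pod_utilizations: per-pod CPU utilization in percent of the request
    (hypothetical sample values, not real measurements).
    """
    current = len(pod_utilizations)
    # The HPA compares the AVERAGE across pods with the target,
    # so a single hot pod can be hidden by idle ones.
    average = sum(pod_utilizations) / current
    ratio = average / target_utilization
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current
    return math.ceil(current * ratio)


# One pod is close to saturation, the other three are lightly loaded.
utilizations = [95, 40, 45, 40]

# Average is 55%, so a 60% target triggers no scale-out.
print(desired_replicas(utilizations, target_utilization=60))

# Lowering the target to 40% does scale out, but by two extra pods.
print(desired_replicas(utilizations, target_utilization=40))
```

With these numbers the hot pod at 95% never triggers scaling at a 60% target, and lowering the target only helps by over-provisioning the whole deployment, which is the inefficiency I would like to avoid.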