Kubernetes service load balancing


We are trying to use HPA to scale our system based on load, and we noticed that HPA works correctly in creating new pods. Our main issue is that there is no fair load balancing between the pods: the old pods handle most of the requests, as shown in the attached image.

As shown in the cluster information below, we use kube-proxy v1.21.2-eksbuild.2 in iptables mode, as it is the only mode supported by Bottlerocket. We also use the AWS App Mesh controller v1.4.1 with Envoy images as the service mesh.

We use a ClusterIP Service to point to the Deployment, and the VirtualService is configured to use the DNS name of the Service.
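For reference, the setup roughly looks like the following sketch. All names, namespaces, and ports here are hypothetical placeholders, and the VirtualService spec assumes the `appmesh.k8s.aws/v1beta2` CRDs installed by the App Mesh controller:

```yaml
# Hypothetical ClusterIP Service in front of the Deployment's pods.
apiVersion: v1
kind: Service
metadata:
  name: my-app          # placeholder name
  namespace: my-ns      # placeholder namespace
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: 8080
---
# App Mesh VirtualService whose awsName is the Service's DNS name.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: my-app
  namespace: my-ns
spec:
  awsName: my-app.my-ns.svc.cluster.local
  provider:
    virtualNode:
      virtualNodeRef:
        name: my-app    # placeholder VirtualNode
```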

Is this behavior happening because kube-proxy is working in iptables mode, or is it because of App Mesh and Envoy?

Cluster information:

Kubernetes version: v1.21.2-eks-0389ca3
Cloud being used: AWS EKS
Installation method: Managed
Host OS: Bottlerocket 1.2.0
CNI and version: AWS vpc-cni v1.9.0-eksbuild.1
CRI and version: containerd://1.4.8+bottlerocket

I would be aiming to force the load balancer to use a “least connections” algorithm, though I’m not quite sure how to do that yet.

My best guess is that this documentation would be the place to start, though:

Hi @protosam,
In our situation, IPVS mode for kube-proxy (the mode that supports least-connections scheduling) is not supported for Bottlerocket OS on AWS EKS, so I am stuck with iptables mode, which does not have that option.
I just wanted to make sure that this behavior is indeed caused by kube-proxy working in iptables mode, and that it has nothing to do with AWS App Mesh and the Envoy proxy sidecars. If so, we want to try eBPF with Calico.

It’s hard to say really, because iptables mode chooses a backend at random.

Keep in mind that any kube-proxy mode is effectively random, once you have more than one client node involved - they do not coordinate their selections nor do they get load info from backends.

Also keep in mind that many clients reuse connections behind the scenes, so you might THINK you are doing many transactions, but at the TCP (and thus IPVS and iptables) level it’s one connection to one backend. This is a common error in load tests.
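To illustrate the two points above, here is a small simulation (not real kube-proxy code, just a sketch of the behavior): each *new connection* picks a backend uniformly at random, so traffic only evens out over many connections, while a client that keeps one connection alive sends every request to a single backend no matter how many requests it makes:

```python
import random
from collections import Counter

def pick_backend(backends):
    # iptables mode behavior: each new TCP connection lands on a
    # uniformly random backend, with no load feedback.
    return random.choice(backends)

def run_load_test(backends, requests, reuse_connections):
    counts = Counter()
    if reuse_connections:
        # HTTP keep-alive: one TCP connection is opened once, so every
        # request rides that connection to the same backend.
        counts[pick_backend(backends)] = requests
    else:
        # A fresh connection per request: selection is re-rolled each time.
        for _ in range(requests):
            counts[pick_backend(backends)] += 1
    return counts

backends = [f"pod-{i}" for i in range(4)]
random.seed(0)
print(run_load_test(backends, 10_000, reuse_connections=False))
print(run_load_test(backends, 10_000, reuse_connections=True))
```

With fresh connections the counts come out roughly even; with connection reuse one pod gets everything, which is exactly the skew a keep-alive load-test client can produce.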

Just to provide an update: after investigating with AWS EKS support, the issue turned out to be caused by the Envoy proxies of App Mesh.
Envoy was caching the endpoint IP instead of the cluster IP of the service.
The solution was to change the Service to a headless service and switch Envoy to strict DNS service discovery to prevent the previous behavior.
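For anyone landing here later, a sketch of what that change could look like. This is my reconstruction, not the exact manifests from the thread: the names and ports are placeholders, and it assumes the App Mesh controller's VirtualNode CRD supports a `responseType` field under DNS service discovery (where `ENDPOINTS` corresponds to Envoy's STRICT_DNS mode); check your controller version's CRD before relying on it:

```yaml
# Headless Service: clusterIP: None makes the DNS name resolve to the
# individual pod IPs instead of a single virtual IP, so Envoy sees
# every backend rather than one cached address.
apiVersion: v1
kind: Service
metadata:
  name: my-app          # placeholder name
  namespace: my-ns      # placeholder namespace
spec:
  clusterIP: None
  selector:
    app: my-app
  ports:
    - port: 8080
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: my-app
  namespace: my-ns
spec:
  podSelector:
    matchLabels:
      app: my-app
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  serviceDiscovery:
    dns:
      hostname: my-app.my-ns.svc.cluster.local
      responseType: ENDPOINTS   # assumed mapping to Envoy STRICT_DNS
```

With strict DNS, Envoy re-resolves the hostname periodically and load-balances across all returned pod IPs, so newly scaled pods start receiving traffic instead of everything sticking to the originally cached endpoint.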
Thanks all for the collaboration.