Load Balancing not working properly

I have a golang server running as a load balancer and listening on UDP ports. When I’m putting load on it, it is directing all of the load to only one pod and not balancing it; with less load it is creating multiple replicas and restarting the pods again and again. Is there any limitation in HPA with UDP?

Cluster information:

Kubernetes version: v1.22.2
Cloud being used: bare metal with MetalLB
Installation method: Helm
Host OS: Ubuntu 20.04

Need more info.

Who is doing the load-balancing, your app or a Kubernetes Service?

Are you load-testing from a single client or many?

I defined an HPA service in Kubernetes, which is doing the load balancing.
The client is a single process that sends UDP data to the pod; the pod does some parsing and then sends the data on to a TCP client.
hpa.yaml

apiVersion: autoscaling/v1   # targetCPUUtilizationPercentage is the autoscaling/v1 field
kind: HorizontalPodAutoscaler
metadata:
  name: temp-hpa
  namespace: "{{ .Release.Namespace }}"   # Release.Namespace is a Helm built-in, not under .Values
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  targetCPUUtilizationPercentage: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}

HPA is not a load-balancer. Who decides “this packet should go to backend A, but that other packet should go to backend B”?

Your post says “I have a golang server running as a load balancer and listening on UDP ports. When I’m putting load on it, it is directing all of the load…”, which sounds like your application is making that decision.

Assuming that’s a misunderstanding, the way Kubernetes generally does load-balancing for UDP is based on the 5-tuple { protocol, source IP, source port, destination IP, destination port } plus a timeout. Once a “connection” is made, any traffic from the same client IP+port to the same service+port will be sent to the same backend pod.
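To make that concrete, here is a minimal Go sketch (the Service address 10.0.0.10:9000 is a placeholder): a single UDP socket keeps one ephemeral source port for its whole lifetime, so every packet it sends shares the same 5-tuple and lands on the same backend pod.

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // One Dial call binds one ephemeral source port for the socket's
        // whole lifetime, so every packet sent through it shares the same
        // 5-tuple and is pinned to the same backend pod.
        conn, err := net.Dial("udp", "10.0.0.10:9000") // placeholder address
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        // The local address (and source port) never changes between sends.
        fmt.Println("source address for every send:", conn.LocalAddr())
        for i := 0; i < 5; i++ {
            if _, err := conn.Write([]byte("payload")); err != nil {
                panic(err)
            }
        }
    }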

Thanks for the explanation; I’m still learning about Kubernetes and my framing of the question was wrong.
Regarding “Once a “connection” is made, any traffic from the same client IP+port to the same service+port will be sent to the same backend pod”: how do I tackle this situation? I have checked, and the client load is using the same source port for sending the messages, and my golang server is not doing any load balancing.
I tried using different ports to send the messages and that worked, but when the load increases it shows the same behaviour: it creates new pods but does not direct any load to them.

This is how UDP load balancing works. Once a “connection” is made, there’s a fairly long timeout, which is reset every time a packet is seen on that “connection”.

You have to use different source ports, at least some of the time.
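For example, something like this sketch: dialing a fresh socket for each message makes the kernel assign a new ephemeral source port, so the proxy sees a new 5-tuple each time and picks a backend again. The function name and target address here are illustrative, not from your code.

    package loadgen

    import "net"

    // sendWithFreshPorts (a hypothetical helper) dials a new UDP socket for
    // every message, so each send usually gets a new ephemeral source port,
    // a new 5-tuple, and therefore a fresh backend choice.
    func sendWithFreshPorts(target string, msgs [][]byte) error {
        for _, m := range msgs {
            conn, err := net.Dial("udp", target) // kernel picks a new source port
            if err != nil {
                return err
            }
            _, werr := conn.Write(m)
            conn.Close() // release the port before the next Dial
            if werr != nil {
                return werr
            }
        }
        return nil
    }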

But I’m observing the same kind of problem with TCP, so which type of “connection” is better suited for proper load balancing? Or is there some other functionality I have to explore to make sure the load is properly divided?

If you are seeing it with TCP, you’ll have to explain in more detail how you are testing.

Are you using a kube Service?

Is your client in the cluster (using the ClusterIP) or outside (using a load-balancer)?

How is your client testing? Does it re-use connections (many HTTP clients do this behind the scenes)?

Mainly I have to do the testing with UDP. For that I’m creating 5 sockets and sending the messages in round-robin fashion across the sockets, at a rate of 100 messages per second.

The client is accessing the server from outside, through the load balancer.
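For reference, the test loop is roughly this sketch (the address is a placeholder):

    package main

    import (
        "net"
        "time"
    )

    func main() {
        const target = "203.0.113.10:9000" // placeholder load-balancer IP:port

        // Five sockets means five fixed source ports, i.e. five distinct
        // 5-tuples, so at most five backend pods can ever receive traffic.
        socks := make([]net.Conn, 5)
        for i := range socks {
            c, err := net.Dial("udp", target)
            if err != nil {
                panic(err)
            }
            defer c.Close()
            socks[i] = c
        }

        // Round-robin over the sockets at ~100 messages per second.
        tick := time.NewTicker(10 * time.Millisecond)
        defer tick.Stop()
        for i := 0; ; i++ {
            <-tick.C
            if _, err := socks[i%len(socks)].Write([]byte("message")); err != nil {
                panic(err)
            }
        }
    }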

Kube-proxy and workalikes balance by “connection”. Since there’s no such thing as a UDP connection, the best we can do is the five-tuple. So if that’s what you’re doing, then I would expect you to get at most five backend pods.

But the problem still remains: all the load goes to only one pod and doesn’t get distributed to the others. With more sockets, say 10 or more, the distribution happens correctly, but creating that many sockets is not feasible.

I’m trying to tell you that if you use UDP and reuse the same client IP and port, and the same server IP and port, then you are going to get the same backend pod. That’s just how Kubernetes works.

There isn’t some trick that you’re missing.

Now, if you want to file a bug report, it could be interesting to think about what we could do for UDP protocols which do not expect a reply. Maybe we could find a way to not track those sorts of connections, but it would be something you have to opt into. If you do need a reply then it gets more complicated. If you need more than one back and forth, then I have nothing for you.

But such a feature would need to be implemented by a number of components in the ecosystem, which would take time. And so you’re not going to have a solution of this form in the very near future.

In the meantime, you may need a distributed load generator for your app, or you need to use more ports.
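As a rough sketch of the “more ports” option (all addresses and numbers here are illustrative, not a definitive implementation): run N workers, each with its own socket and therefore its own source port, so the Service can spread the same total rate across up to N backends.

    package main

    import (
        "net"
        "sync"
        "time"
    )

    func main() {
        const (
            target  = "203.0.113.10:9000" // placeholder Service address
            workers = 20                  // one socket (source port) per worker
            msgs    = 1000                // messages per worker
        )

        var wg sync.WaitGroup
        for w := 0; w < workers; w++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                conn, err := net.Dial("udp", target) // fresh ephemeral source port
                if err != nil {
                    return
                }
                defer conn.Close()
                for i := 0; i < msgs; i++ {
                    if _, err := conn.Write([]byte("message")); err != nil {
                        return
                    }
                    time.Sleep(10 * time.Millisecond) // ~100 msg/s per worker
                }
            }()
        }
        wg.Wait()
    }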