LoadBalancer service not working

marcelk · March 24, 2019, 1:43am

I’m trying to figure out why my service (type: LoadBalancer) isn’t working. This is running on AWS.

I followed the steps in Debug Services - Kubernetes but can’t find any smoking gun.

In particular, when ssh’ing into a cluster node, the following all work:
nslookup <fq service name> <dns ip>
curl <service ip>:<port>
nslookup <external load balancer hostname>
kube-proxy is running
curl localhost:<nodeport>

What doesn’t work:
curl <external load balancer hostname>:<port>: Empty reply from server
curl <external load balancer ip>:<port>: Empty reply from server

I have another service running in the system that works just fine (meaning curl <load balancer host>:<port> returns something). I looked at the iptables entries for both services, but there doesn’t appear to be anything obvious missing for the non-working service.

(The one thing that didn’t agree with what Debug Services - Kubernetes expects is the content of /etc/resolv.conf:
domain us-west-2.compute.internal
search us-west-2.compute.internal
nameserver 172.20.0.2

However, this doesn’t appear to impact the working service.)

Can anyone suggest some next steps to get to the heart of the problem?

rata · March 24, 2019, 5:04am

What is the problem you see? Has a load balancer been created on AWS (you can check by browsing the AWS console)? Are there any events on the objects (svc)?

If the load balancer has been created on AWS, look at the listener it is using (in its config). Which port is it listening on? And to which port does it forward traffic on the kubernetes nodes? Is that the port listed in the service as nodePort?
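The listener check can be sketched in Python roughly like this. It assumes you have already loaded one entry of `aws elb describe-load-balancers` output (classic ELB) and the output of `kubectl get svc <name> -o json` as plain dicts. The function name `find_port_mismatches` and the sample port numbers are made up for illustration, and the field names are the classic ELB and Service JSON shapes as best understood here, so double check them against your own output.

```python
def find_port_mismatches(elb_description, service):
    """Compare a classic ELB description with a Service object (plain dicts)
    and return a list of human-readable problems."""
    spec = service["spec"]
    node_ports = {p["nodePort"] for p in spec.get("ports", []) if "nodePort" in p}

    # With externalTrafficPolicy: Local the health check may use a separate port.
    health_ports = set(node_ports)
    if "healthCheckNodePort" in spec:
        health_ports.add(spec["healthCheckNodePort"])

    problems = []

    # Every listener should forward to a port the Service exposes as nodePort.
    for description in elb_description.get("ListenerDescriptions", []):
        listener = description["Listener"]
        if listener["InstancePort"] not in node_ports:
            problems.append(
                f"listener {listener['LoadBalancerPort']} forwards to "
                f"{listener['InstancePort']}, which is not a nodePort of the service"
            )

    # The health check target looks like "TCP:31234" or "HTTP:31234/healthz".
    target = elb_description.get("HealthCheck", {}).get("Target", "")
    if ":" in target:
        port_text = target.split(":", 1)[1].split("/", 1)[0]
        if int(port_text) not in health_ports:
            problems.append(
                f"health check target {target} does not point at a port of the service"
            )
    return problems


# Hypothetical sample data: the listener is fine, the health check is not.
sample_elb = {
    "ListenerDescriptions": [
        {"Listener": {"Protocol": "TCP", "LoadBalancerPort": 80,
                      "InstanceProtocol": "TCP", "InstancePort": 31000}}
    ],
    "HealthCheck": {"Target": "TCP:32000"},
}
sample_service = {"spec": {"ports": [{"port": 80, "targetPort": 8080, "nodePort": 31000}]}}

for problem in find_port_mismatches(sample_elb, sample_service):
    print(problem)
```

An empty result only means the port numbers line up; security groups still need to be checked separately.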

Do the kubernetes nodes accept traffic from the security group the load balancer is using? And if so, does the security group the load balancer is using accept traffic from the internet on the ports it is listening on?

I think one of those should be causing this issue, if I had to bet. But please check them all, and report back whether it works or not.

marcelk · March 25, 2019, 4:23am

Hi Rodrigo, thanks for your response.

The one thing I noticed is that all of the instances for the broken service’s load balancer are listed as ‘out of service’ (and - not surprisingly - for the healthy service they are ‘in service’), so I need to dig into that.

rata · March 25, 2019, 11:16pm

Great. And which port is it using for the health check? Is it using the nodePort of the service?

If it is, then my next bet would be that the load balancer can’t connect to the workers on that port. So, I’d make sure the security group the workers are running in accepts incoming connections from the security group the load balancer is using. Can you check that?
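One rough way to test this is to try a plain TCP connection to the node on the health check port, which is similar to what a TCP health check does. A minimal sketch in Python, where `tcp_port_open` is a made-up helper name and the host and port in the usage note are placeholders:

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_port_open("10.0.1.23", 31234)` run from a machine that is subject to the same security group rules as the load balancer. Keep in mind that the result only reflects the network path from the machine you run it on, so a success from inside the node itself says nothing about the security groups.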

marcelk · March 26, 2019, 12:52am

Hi Rodrigo, your guess was correct, the health check was configured incorrectly (in this case, the nodeport it was using wasn’t serving anything). When I manually reconfigured the load balancer to use a different port, one that’s actually listened on, it worked.

Thanks for your help!

A follow-on question: I noticed that the load balancers have all EC2 instances that are part of the entire cluster registered. In my specific case that is undesirable because the (single) pod that’s backing the service can only run on a single node (via a node selector). Was it a conscious choice to register all cluster nodes with every load balancer, even when the backing pods can only run on a subset of the nodes?

rata · March 26, 2019, 1:46am

Glad it worked!

Yes, registering all is intended. The thing is the following:

The ELB does not know about pods, it knows about VM instances. So, you can only register instances and not pods. Then, how do you distribute traffic evenly across pods? One option is to register all nodes as backends and then let kubernetes do the load balancing between pods (as kubernetes does know about pods).

This is basically what is happening, and it is the most common setup. The nodePort, as you saw in the iptables rules, load balances between pods.

If you have a good reason to not want that (but I would recommend really having a good reason), you can change the behavior by setting the service’s externalTrafficPolicy to Local. If you do that, the health check will fail on all nodes except the ones that are running a pod for the application.

That way, you will see only one node as healthy. But then, if the pod is scheduled to some other node, you need to wait for the AWS load balancer to mark that node as healthy (according to its health check params), which might introduce downtime and other unwanted effects.
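For reference, a minimal sketch of a Service with that policy set, written as a small Python script that prints the manifest as JSON (kubectl accepts JSON as well as YAML). The helper name `load_balancer_service`, the service name `my-app`, the selector and the ports are all placeholders and not taken from this thread.

```python
import json


def load_balancer_service(name, selector, port, target_port, local_only=False):
    """Build a Service manifest of type LoadBalancer as a plain dict.

    local_only=True sets externalTrafficPolicy to "Local", so a node only
    receives external traffic for pods running on that node. The default
    value, "Cluster", lets any node forward to any pod.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": selector,
            "ports": [{"protocol": "TCP", "port": port, "targetPort": target_port}],
            "externalTrafficPolicy": "Local" if local_only else "Cluster",
        },
    }


if __name__ == "__main__":
    # Placeholder values; replace them with your own service details.
    manifest = load_balancer_service("my-app", {"app": "my-app"}, 80, 8080, local_only=True)
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into `kubectl apply -f -`. Switching `local_only` back to `False` restores the default behavior described above.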

IMHO, it’s better to let kubernetes do the load balancing for pods, as it knows about pods. Amazon ELB doesn’t, so it’s not the best layer for the true load balancing.

mohibk · August 4, 2019, 2:14pm

Can you please tell me how to let Kubernetes do the load balancing for pods as you mentioned above? Does it do it by default if you expose a service like in the tutorial Hello Minikube - Kubernetes

using

kubectl expose deployment hello-node --type=LoadBalancer --port=8080

It would be helpful if you could explain, as I want load balancing across pods done by Kubernetes itself.

rata · August 4, 2019, 7:44pm

Yes, it is done when you expose the service like that. No need to do anything special

rata · August 6, 2019, 8:48am

Yes, it happens.

Load balancing is done using iptables rules (if kube-proxy is using iptables; if it is using something else, then it is done by something else). Basically, when a request goes to a service IP, iptables routes it to one of the pods using a probability (so, if you have 2 pods, roughly 50% of the time it will go to one and 50% to the other). Just normal iptables rules.
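To illustrate the probability part, here is a small Python simulation. It assumes the usual layout of such a chain as understood here: one rule per pod, evaluated in order, where the rules match with probability 1/n, 1/(n-1), and so on, and the last rule always matches. The function names are made up, and this is a model of the idea, not kube-proxy code.

```python
import random
from collections import Counter


def rule_probabilities(n):
    """Match probability of each rule in a chain for n backends:
    1/n, 1/(n-1), ..., 1/1. Rules are evaluated in this order."""
    return [1.0 / (n - i) for i in range(n)]


def pick_backend(n, rng=random):
    """Walk the chain in order; the first rule that matches wins."""
    for index, probability in enumerate(rule_probabilities(n)):
        if rng.random() < probability:
            return index
    return n - 1  # not reached, since the last rule has probability 1.0


def simulate(n, requests, seed=0):
    """Count how many of `requests` connections land on each backend."""
    rng = random.Random(seed)
    return Counter(pick_backend(n, rng) for _ in range(requests))


if __name__ == "__main__":
    print(rule_probabilities(3))
    print(simulate(3, 30000))
```

Although the per-rule probabilities differ, each pod ends up with about the same share: with 3 pods, the second rule is only reached 2/3 of the time and then matches half of those, which is again 1/3 overall.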

Maybe the service documentation on the kubernetes website expands on this? Have you checked that out?

mohibk · August 6, 2019, 2:13pm

@rata Thanks. I read the Services document which also describes the same. It says ‘By default, kube-proxy in iptables mode chooses a backend at random…
If kube-proxy is running in iptables mode and the first Pod that’s selected does not respond, the connection fails. This is different from userspace mode: in that scenario, kube-proxy would detect that the connection to the first Pod had failed and would automatically retry with a different backend Pod.’

Looks like the above is the reason one of my requests fails?

rata · August 6, 2019, 4:56pm

Not sure which problem you are talking about. Is it another thread? I will look later.

mohibk · August 6, 2019, 5:29pm

Sorry for mixing two things. Yes, it is another thread that I referred to here: Loadbalancing across pods in a service using minikube. I shall put this comment there for reference.
