NodePort on multiple nodes


#1

Hello everyone!

I have a strange issue with a NodePort service on multiple nodes. Let me explain the full situation.

I have a bare-metal Kubernetes cluster with 3 nodes: 1 master and 2 workers.
The worker nodes have the following external IPs (these IPs are only examples):

  • node1: 100.100.100.101
  • node2: 100.100.100.102

I have created a Deployment (with one replica) for a simple nginx container, and a NodePort Service for this deployment, using this configuration:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-example
      template:
        metadata:
          labels:
            app: nginx-example
        spec:
          containers:
          - name: nginx
            image: nginx:latest
            ports:
            - containerPort: 80
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: nginx-example-service
    spec:
      selector:
        app: nginx-example
      ports:
      - protocol: TCP
        targetPort: 80
        port: 80
        name: http
      type: NodePort

In other words, I want to access this container via nodeIP:servicePort; for example, 100.100.100.102:80 should forward my request to the nginx container.

The pod is created on node2:

NAME                                READY     STATUS    RESTARTS   AGE       IP                NODE
nginx-deployment-7fdcd5bc84-c5gnz   1/1       Running   0          4m        192.168.136.164   kubernetes-main-node-2

And the service exists:

NAME                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
nginx-example-service   NodePort    10.106.51.84   <none>        80:32570/TCP   6s

But something strange happens. In the official documentation I see the following:

If you set the type field to NodePort, the Kubernetes master will allocate a port from a range specified by --service-node-port-range flag (default: 30000-32767), 
and **each Node** will proxy that port (the same port number on every Node) into your Service.

But it doesn’t work. The nodes do not forward traffic from port 80 to this service.
Node1:
100.100.100.101:80 - KO
100.100.100.101:32570 - KO
Node2:
100.100.100.102:80 - KO
100.100.100.102:32570 - OK

Only the node on which the pod is running forwards this automatically allocated port (32570 in this example).

Question 1: Why don’t all nodes forward traffic to the service?
Question 2: Why is traffic to the port I specified (port 80 in this example) not forwarded to the service?
Question 3: What am I doing wrong? :slight_smile:


#2

The NodePort should work across nodes. Are you sure your pod networking is configured correctly, so that node1 can access pods on node2?

The service port only exists on the service IP. The node port exists on the node IP(s).


#3

Yes, I confirm that from node1 I can connect to the pod running on node2.

The node port exists on the node IP(s)

But how does it work when the service points to multiple pods running on different nodes? (For example, a deployment with multiple replicas spread across nodes.)


#4

It does round robin between all the pods.

Let’s say you have a node port, 1234. If you connect to some node’s IP on port 1234, it will forward to one pod. Connect again and it will forward to some other pod, and so on, in a round-robin fashion.

If you look at the node’s iptables rules, you will see this is done with probabilities. These iptables rules are updated by kube-proxy every time a pod dies, is created, and so on.

This is the default and most common way it works in Kubernetes, but of course you can configure things differently. For example, the default is to use iptables, but there is also support (I think not GA yet) for doing it with IPVS.
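One nuance: in iptables mode the choice is random per connection rather than a strict round robin, though it evens out statistically. The probability chain kube-proxy programs can be sketched in Python (pod names are illustrative; the real thing happens in iptables `statistic` rules, not in code like this):

```python
import random
from collections import Counter

def pick_endpoint(endpoints):
    """Mimic kube-proxy's iptables chain: rule i matches with
    probability 1/(n - i); otherwise fall through to the next rule."""
    n = len(endpoints)
    for i in range(n - 1):
        if random.random() < 1.0 / (n - i):
            return endpoints[i]
    return endpoints[-1]  # the last rule matches unconditionally

random.seed(0)
counts = Counter(pick_endpoint(["pod-a", "pod-b", "pod-c"])
                 for _ in range(30000))
# Each pod ends up with roughly a third of the 30000 connections.
```

The chain gives each endpoint an overall probability of 1/n: with three pods, the first rule matches 1/3 of the time, the second matches 1/2 of the remaining 2/3, and the last rule catches the rest.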


#5

NodePorts turn the nodes into gateways.

Are you setting ‘externalTrafficPolicy’ by chance?
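This is worth checking, because `externalTrafficPolicy: Local` produces exactly the symptom described: only nodes hosting a ready pod answer on the NodePort. A sketch of the two values on the manifest from this thread (field names and semantics per the Service spec):

    kind: Service
    apiVersion: v1
    metadata:
      name: nginx-example-service
    spec:
      type: NodePort
      # Cluster (the default): every node proxies the NodePort and may
      # forward to a pod on another node; the client source IP is masked.
      # Local: only nodes running a ready pod accept traffic on the
      # NodePort, which preserves the client source IP but makes all
      # other nodes refuse the connection - the behaviour reported above.
      externalTrafficPolicy: Cluster
      selector:
        app: nginx-example
      ports:
      - protocol: TCP
        port: 80
        targetPort: 80

You can check the current value with `kubectl get svc nginx-example-service -o jsonpath='{.spec.externalTrafficPolicy}'`.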


#6

Concerning your 2nd question:

“Why traffic from the port I specified (in this example it’s 80 port) is not forwarded to the service?”

That’s because you didn’t explicitly specify the nodePort key in your service manifest, so you got a random port. You might want to use a manifest like this:

    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: nginx-example-service
    spec:
      selector:
        app: nginx-example
      ports:
      - protocol: TCP
        targetPort: 80
        port: 80
        nodePort: 30080
        name: http
      type: NodePort

You can only choose an (unused!) TCP port from the range defined in the kube-apiserver manifest of your control-plane nodes:

--service-node-port-range flag (default: 30000-32767)

#7

OK, but what does the port key mean, then?

But still, I know that in this case the port will be allocated from the range; as I described in the original message, the port is 32570, yet even using this port I cannot connect to the service via node1’s IP.
It looks like this port is bound only on the node on which the pod is currently running.


#8
  • targetPort is the port on which the pods are listening for incoming connections. In your example, your nginx pod is listening on port 80.
  • port is the port that in-cluster clients use to connect to the service. From a running pod, the service is available at http://<servicename>.<namespace>:<port> (you can use any scheme you want, not just HTTP).
  • nodePort is the port used for inbound connections originating from outside the cluster (in cases where an Ingress is not an option). You can access this service from outside the cluster at http://<any_node_ip>:<nodePort>.
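
Applied to the manifest from this thread, the three ports line up as follows (a sketch; the nodePort value is illustrative and must be unused and inside the allowed range):

    kind: Service
    apiVersion: v1
    metadata:
      name: nginx-example-service
    spec:
      type: NodePort
      selector:
        app: nginx-example
      ports:
      - name: http
        protocol: TCP
        targetPort: 80   # where the nginx pods listen
        port: 80         # in-cluster: http://nginx-example-service.<namespace>:80
        nodePort: 30080  # from outside: http://<any_node_ip>:30080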

In your example, it is expected that node1 listens on port 32570. You should check the kube-proxy logs on each node to investigate what’s going on. Maybe this port is already used by another process running outside of Kubernetes?


#9

https://speakerdeck.com/thockin/kubernetes-a-very-brief-explanation-of-ports


#10

What CNI are you using? I’ve had this issue with Calico when the BGP mesh wasn’t fully up. Port 32570 should work on all nodes and on the control plane. If the BGP mesh isn’t established properly, traffic won’t be routed correctly and you get exactly what you’re seeing.