NodePort service is not working from Master

Hi!

I have a Kubernetes cluster consisting of 1 master and 2 worker nodes. I am using the Calico pod network.

Cluster information:

Kubernetes version: 1.28.2
Cloud being used: bare metal
Installation method: dnf install
Host OS: CentOS 8

I have deployed a Traefik ingress controller exposed through NodePort services for access from outside the cluster. I also deployed a very simple whoami service with one replica and added the Ingress resource needed to reach it through the Traefik controller from outside the cluster. I believe I added every other required configuration.

When I try to access the whoami service using the nodePort assigned to Traefik, it does not work when I address the master: no response, just a timeout. This happens only with the master; when I address any of the worker nodes on the same nodePort, it works fine and I get the whoami response.

I use the firewalld service, and all the necessary ports are open. It does not really matter, though: if I disable the firewall, the call still times out, just as it does with the firewall turned on.

First, I checked the Traefik deployment access log, but nothing is written there when I address the master. (The log shows the “routing” as expected when I address the worker nodes.) So I thought there must be some issue with packet handling on the master. I tried to trace iptables using the method described here, but the log wasn’t moving, and it looked like no packets were hitting any rules or chains on the master when I targeted the nodePort.

I also have no network policies in place.

Tell me which configs you want to see in order to understand what is going on, and I will update this post!

Here are some of my configs in advance:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-account
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-role

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses
      - ingressclasses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses/status
    verbs:
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: traefik-role-binding

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-role
subjects:
  - kind: ServiceAccount
    name: traefik-account
    namespace: default # Using "default" because we did not specify a namespace when creating the ServiceAccount.
---
apiVersion: v1
kind: Service
metadata:
  name: traefik-dashboard-service

spec:
  type: NodePort
  ports:
    - nodePort: 32001
      port: 8080
      targetPort: dashboard
  selector:
    app: traefik
---
apiVersion: v1
kind: Service
metadata:
  name: traefik-web-service

spec:
  type: NodePort
  ports:
    - nodePort: 32000
      targetPort: web
      port: 80
  selector:
    app: traefik
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: traefik-deployment
  labels:
    app: traefik

spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-account
      containers:
        - name: traefik
          image: traefik:v2.10
          args:
            - --api.insecure
            - --providers.kubernetesingress
            - --accesslog
          ports:
            - name: web
              containerPort: 80
            - name: dashboard
              containerPort: 8080

Ports opened on my master (I’d like to use the application on port 32000):

  ports: 1903/tcp 6443/tcp 8080/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 2379-2380/tcp 5473/tcp 179/tcp 4789/udp 8285/udp 8472/udp 30000-32767/tcp
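As a quick sanity check on the list above, here is a minimal sketch that tests whether a given port falls inside a firewalld-style port list, including ranges. The helper name `port_is_listed` is made up for this example, and the list is simply the one copied from the post.

```shell
#!/usr/bin/env bash
# port_is_listed PORT PROTO "LIST"   (hypothetical helper, not part of firewalld)
# Returns 0 if PORT/PROTO is covered by a space-separated port list made of
# single ports (6443/tcp) or ranges (30000-32767/tcp).
port_is_listed() {
  local port=$1 proto=$2 list=$3 entry spec lo hi
  for entry in $list; do                         # intentional word splitting
    [ "${entry##*/}" = "$proto" ] || continue    # protocol must match
    spec=${entry%/*}                             # strip "/tcp" or "/udp"
    lo=${spec%-*}                                # range start (or the single port)
    hi=${spec#*-}                                # range end (or the single port)
    if [ "$port" -ge "$lo" ] && [ "$port" -le "$hi" ]; then
      return 0
    fi
  done
  return 1
}

# Port list copied from the post above.
ports="1903/tcp 6443/tcp 8080/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 2379-2380/tcp 5473/tcp 179/tcp 4789/udp 8285/udp 8472/udp 30000-32767/tcp"

if port_is_listed 32000 tcp "$ports"; then
  echo "32000/tcp is covered"
else
  echo "32000/tcp is NOT covered"
fi
```

On a live host the list could be taken from `sudo firewall-cmd --list-ports` instead of the hard-coded string. This only confirms that the port is allowed; it says nothing about whether packets are actually forwarded to the pod.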

And the whoami ingress definition, which is supposed to work across the cluster:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami-ingress-test
spec:
  rules:
  - http:
      paths:
      - path: /whoamitest
        pathType: Prefix
        backend:
          service:
            name: whoami
            port:
              name: web

Any help would be highly appreciated!

Hi,
Check iptables rules on the master node. There should be chains for traefik services.

Hi!

Here are the iptables rules for Traefik on the master:

[sd1@kubemaster ~]$ sudo iptables-save | grep traefik
-A KUBE-SERVICES -d 10.104.212.41/32 -p tcp -m comment --comment "default/traefik-web-service cluster IP" -m tcp --dport 80 -j KUBE-SVC-QJO4PTA6RFWJGKGL
-A KUBE-SERVICES -d 10.96.53.160/32 -p tcp -m comment --comment "default/traefik-dashboard-service cluster IP" -m tcp --dport 8080 -j KUBE-SVC-HW3AZ37GGTDKXDED
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/traefik-web-service" -m tcp --dport 32000 -j KUBE-EXT-QJO4PTA6RFWJGKGL
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/traefik-dashboard-service" -m tcp --dport 32001 -j KUBE-EXT-HW3AZ37GGTDKXDED
-A KUBE-EXT-QJO4PTA6RFWJGKGL -m comment --comment "masquerade traffic for default/traefik-web-service external destinations" -j KUBE-MARK-MASQ
-A KUBE-SVC-QJO4PTA6RFWJGKGL ! -s 192.168.0.0/16 -d 10.104.212.41/32 -p tcp -m comment --comment "default/traefik-web-service cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
-A KUBE-SVC-QJO4PTA6RFWJGKGL -m comment --comment "default/traefik-web-service -> 192.168.29.133:80" -j KUBE-SEP-3LBWT47OILQH4Q6L
-A KUBE-SEP-3LBWT47OILQH4Q6L -s 192.168.29.133/32 -m comment --comment "default/traefik-web-service" -j KUBE-MARK-MASQ
-A KUBE-SEP-3LBWT47OILQH4Q6L -p tcp -m comment --comment "default/traefik-web-service" -m tcp -j DNAT --to-destination 192.168.29.133:80
-A KUBE-EXT-HW3AZ37GGTDKXDED -m comment --comment "masquerade traffic for default/traefik-dashboard-service external destinations" -j KUBE-MARK-MASQ
-A KUBE-SVC-HW3AZ37GGTDKXDED ! -s 192.168.0.0/16 -d 10.96.53.160/32 -p tcp -m comment --comment "default/traefik-dashboard-service cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SVC-HW3AZ37GGTDKXDED -m comment --comment "default/traefik-dashboard-service -> 192.168.29.133:8080" -j KUBE-SEP-YCETS436RCO4BVYE
-A KUBE-SEP-YCETS436RCO4BVYE -s 192.168.29.133/32 -m comment --comment "default/traefik-dashboard-service" -j KUBE-MARK-MASQ
-A KUBE-SEP-YCETS436RCO4BVYE -p tcp -m comment --comment "default/traefik-dashboard-service" -m tcp -j DNAT --to-destination 192.168.29.133:8080

And the rules are very similar on the worker nodes, except for the first two:

$ iptables-save | grep traefik
-A cali-pro-_NQSRaCm5vqwjXn5AsY -m comment --comment "cali:ymVIomi_YfYPTrR1" -m comment --comment "Profile ksa.default.traefik-account egress"
-A cali-pri-_NQSRaCm5vqwjXn5AsY -m comment --comment "cali:t7z67GuJglL2M9jI" -m comment --comment "Profile ksa.default.traefik-account ingress"
-A KUBE-SERVICES -d 10.96.53.160/32 -p tcp -m comment --comment "default/traefik-dashboard-service cluster IP" -m tcp --dport 8080 -j KUBE-SVC-HW3AZ37GGTDKXDED
-A KUBE-SERVICES -d 10.104.212.41/32 -p tcp -m comment --comment "default/traefik-web-service cluster IP" -m tcp --dport 80 -j KUBE-SVC-QJO4PTA6RFWJGKGL
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/traefik-dashboard-service" -m tcp --dport 32001 -j KUBE-EXT-HW3AZ37GGTDKXDED
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/traefik-web-service" -m tcp --dport 32000 -j KUBE-EXT-QJO4PTA6RFWJGKGL
-A KUBE-EXT-HW3AZ37GGTDKXDED -m comment --comment "masquerade traffic for default/traefik-dashboard-service external destinations" -j KUBE-MARK-MASQ
-A KUBE-SVC-HW3AZ37GGTDKXDED ! -s 192.168.0.0/16 -d 10.96.53.160/32 -p tcp -m comment --comment "default/traefik-dashboard-service cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SVC-HW3AZ37GGTDKXDED -m comment --comment "default/traefik-dashboard-service -> 192.168.29.133:8080" -j KUBE-SEP-YCETS436RCO4BVYE
-A KUBE-EXT-QJO4PTA6RFWJGKGL -m comment --comment "masquerade traffic for default/traefik-web-service external destinations" -j KUBE-MARK-MASQ
-A KUBE-SVC-QJO4PTA6RFWJGKGL ! -s 192.168.0.0/16 -d 10.104.212.41/32 -p tcp -m comment --comment "default/traefik-web-service cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
-A KUBE-SVC-QJO4PTA6RFWJGKGL -m comment --comment "default/traefik-web-service -> 192.168.29.133:80" -j KUBE-SEP-3LBWT47OILQH4Q6L
-A KUBE-SEP-YCETS436RCO4BVYE -s 192.168.29.133/32 -m comment --comment "default/traefik-dashboard-service" -j KUBE-MARK-MASQ
-A KUBE-SEP-YCETS436RCO4BVYE -p tcp -m comment --comment "default/traefik-dashboard-service" -m tcp -j DNAT --to-destination 192.168.29.133:8080
-A KUBE-SEP-3LBWT47OILQH4Q6L -s 192.168.29.133/32 -m comment --comment "default/traefik-web-service" -j KUBE-MARK-MASQ
-A KUBE-SEP-3LBWT47OILQH4Q6L -p tcp -m comment --comment "default/traefik-web-service" -m tcp -j DNAT --to-destination 192.168.29.133:80
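To compare the two listings above without reading every rule, one option is to pull out only the DNAT targets for a service and check that each node prints the same endpoint. This is a rough sketch; `dnat_targets` is a hypothetical helper name, and it assumes the rule layout kube-proxy produced in the output above.

```shell
#!/usr/bin/env bash
# dnat_targets SERVICE   (hypothetical helper)
# Reads iptables-save output on stdin and prints the unique pod endpoints
# (IP:port) that the DNAT rules for SERVICE point to.
dnat_targets() {
  local svc=$1
  # Keep only rules whose comment is exactly the service name, then
  # extract the address that follows "--to-destination".
  grep -F -- "--comment \"$svc\"" \
    | sed -n 's/.*-j DNAT --to-destination \([0-9.:]*\).*/\1/p' \
    | sort -u
}

# Example, to be run on each node:
#   sudo iptables-save | dnat_targets default/traefik-web-service
```

With the rules shown in this thread, both the master and the worker would print 192.168.29.133:80. If the NAT rules agree on every node, the difference is more likely in how packets reach the pod network from the master than in the kube-proxy rules.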

Thank you. Try to send an HTTP request to the pod IP from the master node.
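A minimal way to run that check, sketched with bash's built-in /dev/tcp redirection so it does not depend on extra tools. The helper name `tcp_probe` is made up for this example; the pod address 192.168.29.133:80 is taken from the iptables output earlier in the thread.

```shell
#!/usr/bin/env bash
# tcp_probe HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper)
# Tries to open a TCP connection and reports what happened:
#   open     the connection was accepted
#   timeout  no answer within the time limit (packets dropped or not routed)
#   failed   the attempt was rejected right away (refused, no route, ...)
tcp_probe() {
  local host=$1 port=$2 secs=${3:-3} rc
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
  rc=$?
  case $rc in
    0)   echo "open" ;;
    124) echo "timeout" ;;   # exit status used by timeout(1) when the limit is hit
    *)   echo "failed" ;;
  esac
}

# Example, run on the master against the Traefik pod:
#   tcp_probe 192.168.29.133 80
```

If curl is installed, `curl -m 5 http://192.168.29.133:80/whoamitest` performs the same check at the HTTP level. A "timeout" from the master combined with "open" from a worker would suggest the master cannot reach pods on other nodes.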

Hi!

Tried. Same result: a timeout. I tried both the NodePort and the target port of the actual service running on the node. I know, trying the node port does not really make sense when you address the pod itself and not the host; I just tried that one as well.

Hi,
I assume there is a problem with the CNI plugin. For some reason there is no communication with pods on other nodes.
According to the Component architecture | Calico Documentation, it uses BGP for inter-node route exchange. Check BGP peering and the routing table as suggested in Troubleshooting commands | Calico Documentation.
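As a rough sketch of that BGP check: the filter below reads the peer table printed by `calicoctl node status` and lists any peer whose INFO column is not "Established". It assumes the pipe-separated table layout with PEER ADDRESS first and INFO as the fifth column; `bgp_not_established` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# bgp_not_established   (hypothetical helper)
# Reads `calicoctl node status` output on stdin and prints "PEER INFO" for
# every BGP peer whose session is not Established. Prints nothing if all
# sessions are up.
bgp_not_established() {
  awk -F'|' 'NF >= 6 && $2 !~ /PEER ADDRESS/ {
    peer = $2; info = $6
    gsub(/^ +| +$/, "", peer)      # trim padding around the cell
    gsub(/^ +| +$/, "", info)
    if (info != "Established") print peer, info
  }'
}

# Example, run on each node:
#   sudo calicoctl node status | bgp_not_established
```

For the routing table, `ip route get 192.168.29.133` on the master shows which interface and next hop the kernel would use to reach the Traefik pod.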

Unfortunately, every output of the calicoctl commands is similar to the ones described on the webpage you linked. I decided to delete my whole cluster and reinstall everything with another pod network, e.g. Flannel. Let’s see if that does the same. If the same issue appears with that one as well, then it might be a problem with the host machine firewall or something. Some weird policy, or who knows.

I’ll report back here on how it goes.

Hey @fox-md !

I’ve bulldozed everything and reinstalled my cluster, but this time I used Flannel instead of Calico. And voila, it works as expected!! I can access the Traefik ingress service from every node; worker or master, it does not matter.

I think the issue was the Calico CNI, as you mentioned. Somehow, it didn’t work out. Flannel looks better, at least it works as I expect.

Thanks for the help.