Network Hairpin issue

Cluster information:

Kubernetes version: v1.19.4
Cloud being used: bare-metal
Installation method:
Host OS: CentOS 7
CNI and version: docker.io/weaveworks/weave-npc:2.7.
CRI and version: docker://19.3.14

Issue:

I followed the debugging instructions at Debug Services | Kubernetes:

  1. kubectl create deployment hostnames --image=k8s.gcr.io/serve_hostname

  2. Scaled the deployment to 3 replicas; each pod was scheduled on a different node (3 nodes in total)

  3. kubectl get pods -l app=hostnames \
    -o go-template='{{range .items}}{{.status.podIP}}{{"\n"}}{{end}}'
    ==> Gives me 3 IP addresses, which I used below

  4. Opened a shell in another pod (not one of the hostnames pods from above) and ran:
    for ep in 10.244.0.5:9376 10.244.0.6:9376 10.244.0.7:9376; do
    wget -qO- $ep
    done
    One of the three pods returns “host not reachable” ==> let us call the node it is deployed on node-08

  5. Expose a service
    $ kubectl expose deployment hostnames --port=80 --target-port=9376

  6. nslookup hostnames succeeds, and the service endpoints point to each of the 3 addresses from step 3.
    So we know the service works…

  7. kube-proxy debug:
    iptables-save | grep hostnames ==> Gives proper output on each of the nodes where a “hostnames” pod is running

  8. Finally looking at this section:
    Edge case: A Pod fails to reach itself via the Service IP
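
For reference, steps 3 and 4 could be scripted so that each endpoint is reported as reachable or not, rather than printing only the wget output. This is a minimal sketch, not part of the Kubernetes docs: the names `check_endpoints` and `FETCH` are made up for illustration, and the 2-second timeout (`-T 2`) is an assumption.

```shell
# FETCH is the command used to fetch a single endpoint (hypothetical knob;
# override it if the pod only has curl, for example).
FETCH="${FETCH:-wget -qO- -T 2}"

# check_endpoints IP:PORT [IP:PORT ...]
# Prints one OK/FAIL line per endpoint and returns non-zero if any failed.
check_endpoints() {
  failed=0
  for ep in "$@"; do
    # Capture the response body; the exit status of the fetch decides OK vs FAIL
    if out=$($FETCH "$ep" 2>/dev/null); then
      echo "OK   $ep -> $out"
    else
      echo "FAIL $ep"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Run inside the test pod as `check_endpoints 10.244.0.5:9376 10.244.0.6:9376 10.244.0.7:9376`. To map a failing IP back to its node, `kubectl get pods -l app=hostnames -o wide` lists the pod IP next to the node name.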

I see I have hairpin-veth enabled on all 3 nodes:
for intf in /sys/devices/virtual/net/cbr0/brif/*; do cat $intf/hairpin_mode; done
On the two nodes where wget got a proper response, 4 files match and the output is 1,1,0,1.
On node-08, where we get “host not reachable”, close to 50 files match.
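
Since a bare list of 0s and 1s does not show which interface has hairpin mode off, a variant of the loop that prints the interface name next to each value may be easier to compare across nodes. This is only a sketch: `hairpin_report` is a hypothetical helper name, and the default path uses cbr0 as in the docs. With Weave the bridge is probably named `weave` rather than cbr0, which is an assumption worth checking on the node.

```shell
# hairpin_report [BRIF_DIR]
# Prints "<interface> <hairpin_mode>" for every port on a bridge, then a summary.
# BRIF_DIR defaults to the cbr0 path used in the Kubernetes docs.
hairpin_report() {
  brif_dir="${1:-/sys/devices/virtual/net/cbr0/brif}"
  total=0
  off=0
  for intf in "$brif_dir"/*; do
    # Skip the unexpanded glob when the directory is empty or missing
    [ -e "$intf/hairpin_mode" ] || continue
    mode=$(cat "$intf/hairpin_mode")
    echo "$(basename "$intf") $mode"
    total=$((total + 1))
    # Count every port whose hairpin_mode is not 1
    [ "$mode" = "1" ] || off=$((off + 1))
  done
  echo "ports=$total hairpin_off=$off"
}
```

Calling `hairpin_report` with no argument checks cbr0; pass another brif directory (for example `/sys/devices/virtual/net/weave/brif`, if that bridge exists) to check a different bridge.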

What do you recommend as the next step?