Connection issues with pods on CentOS 8 nodes


Cluster information:

Kubernetes version: 1.16
Cloud being used: bare-metal
Installation method: rke
Host OS: CentOS 8
CNI and version:
CRI and version:
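
(For completeness, the CNI and CRI versions can be read off the cluster itself, e.g.:)

kubectl get nodes -o wide        # VERSION and CONTAINER-RUNTIME columns
kubectl -n kube-system get pods  # shows which CNI pods are deployed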

I’ve successfully installed Kubernetes 1.16 with rke 0.3.2; however, there is no connectivity from the pods to the outside.

connection from pod

bash-5.0# traceroute 172.217.168.14
traceroute to 172.217.168.14 (172.217.168.14), 30 hops max, 46 byte packets
 1  x.x.x.x (x.x.x.x)  0.012 ms  0.098 ms  0.004 ms
 2  *  *  *
 3  *  *  *
 4  *  *  *
 5  *c^C
 
bash-5.0# curl 172.217.168.14
curl: (7) Failed to connect to 172.217.168.14 port 80: Operation timed out

connection from node

The connection on the hosts seems fine

[user@node001 ~]$ ping google.com
PING google.com(fra16s25-in-x0e.1e100.net (2a00:1450:4001:825::200e)) 56 data bytes
64 bytes from fra16s25-in-x0e.1e100.net (2a00:1450:4001:825::200e): icmp_seq=1 ttl=57 time=4.84 ms
64 bytes from fra16s25-in-x0e.1e100.net (2a00:1450:4001:825::200e): icmp_seq=2 ttl=57 time=4.86 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 4.836/4.850/4.864/0.014 ms

[user@node001 ~]$ curl 172.217.168.14
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

The k8s nodes themselves do not run firewalld, and I checked several things which seem fine:
selinux disabled

[user@node001 ~]$ cat /etc/sysconfig/selinux

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
# SELINUX=enforcing
SELINUX=disabled
# SELINUXTYPE= can take one of these three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected. 
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
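
The config file only takes effect after a reboot, so the runtime state can be double-checked as well; getenforce should report Disabled here:

getenforce
sestatus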

sysctl

[user@node001 ~]$ sysctl net.bridge
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
....
[user@node001 ~]$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
[user@node001 ~]$ sysctl net.ipv6.conf.all.forwarding
net.ipv6.conf.all.forwarding = 1
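
In case these were set by hand, a drop-in file keeps them across reboots; a minimal sketch (the file name is just an example):

# /etc/sysctl.d/99-kubernetes.conf (example name)
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

# apply without a reboot
sudo sysctl --system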

iptables

sudo iptables -L -v -n
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
 145K 6575K DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
 145K 6575K DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
  190 18756 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
   63  3780 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
  296 15245 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
  296 15245 DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
 145K 6575K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
  296 15245 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 145K 6575K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           
# Warning: iptables-legacy tables present, use iptables-legacy to see them
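
For completeness, the warning on the last line suggests there are rules in both iptables backends (CentOS 8 defaults to the nft backend); they can be inspected separately with something like:

sudo iptables-legacy -L -v -n   # if the legacy binary is installed on the host
sudo nft list ruleset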

Not sure what else to check. Any hints?

Did you install some CNI?

I am using Canal:

kube-system     canal-xf8g8                                2/2     Running     0          45h
kube-system     canal-zvrts                                2/2     Running     0          45h

Did you use --pod-network-cidr=10.244.0.0/16?

Well, I used rke but did not specify anything particular in that regard. By default it uses Canal, but there is nothing mentioned about the CIDR config.

By default, the network plug-in is canal

Details about rke and network: https://rancher.com/docs/rancher/v2.5/en/faq/networking/cni-providers/

update

Regarding the CIDR, I found this:

Cluster CIDR (cluster_cidr) - The CIDR pool used to assign IP addresses to pods in the cluster. By default, each node in the cluster is assigned a /24 network from this pool for pod IP assignments. The default value for this option is 10.42.0.0/16.

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
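
In RKE terms that setting would, if I read the docs correctly, go into cluster.yml roughly like this (sketch; the values shown are just the defaults):

# cluster.yml (sketch)
network:
  plugin: canal
services:
  kube-controller:
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16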

I haven’t been using rke, but still trying to help you :slightly_smiling_face:


I know and I appreciate it!

If I do a traceroute, it goes no further than the node:

bash-5.0# traceroute 172.217.168.46
traceroute to 172.217.168.46 (172.217.168.46), 30 hops max, 46 byte packets
 1  x.x.x.14 (x.x.x.14)  0.005 ms  0.013 ms  0.002 ms
 2  *  *  *
 3  *  *^C

Is networking on your nodes set up correctly? Default gateway and so on?

I would say so. Setup is done via Ansible using the defaults from the hosting provider, plus an additional VLAN which connects the nodes internally. Here is my ip route output:

default via x.x.x.1 dev enp4s0 proto static metric 100 
x.x.x.1 dev enp4s0 proto static scope link metric 100 
x.x.x.14 dev enp4s0 proto kernel scope link src x.x.x.14 metric 100 
10.42.0.3 dev calie0ef90acfad scope link 
10.42.0.4 dev cali6d09fa47963 scope link 
10.42.0.5 dev calicfe92e547cd scope link 
10.42.0.6 dev cali83e218e70ab scope link 
10.42.1.0/24 via 10.42.1.0 dev flannel.1 onlink 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.100.0/24 dev enp4s0.4000 proto kernel scope link src 192.168.100.2 metric 400 

Maybe also useful to know: I use a multitool container for debugging. This is its config:

 kubectl get pods -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP          NODE        NOMINATED NODE   READINESS GATES
multitool   1/1     Running   1          20h   10.42.0.4   x.x.x.14    <none>           <none>
bash-5.0# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if106528: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether ca:49:03:ec:75:43 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.42.0.4/32 scope global eth0
       valid_lft forever preferred_lft forever
4: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
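
(For reference, such a debug pod can be started with a plain kubectl run against any network-multitool style image; the image name below is only an example:)

kubectl run multitool --image=praqma/network-multitool --restart=Never
kubectl exec -it multitool -- bash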

Hey, did you have any success?

Unfortunately not yet. I’ve also tried with Calico instead of Canal with the same problem.

Following this guide now

Checking the BGP peer status looks fine:

[user@node001 ~]$ sudo ./calicoctl node status 
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+------------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |   SINCE    |    INFO     |
+---------------+-------------------+-------+------------+-------------+
| 192.168.100.2 | node-to-node mesh | up    | 2019-11-18 | Established |
+---------------+-------------------+-------+------------+-------------+

IPv6 BGP status
No IPv6 peers found.

and

[user@node002 ~]$ sudo ./calicoctl node status 
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+------------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |   SINCE    |    INFO     |
+---------------+-------------------+-------+------------+-------------+
| 192.168.100.1 | node-to-node mesh | up    | 2019-11-18 | Established |
+---------------+-------------------+-------+------------+-------------+

IPv6 BGP status
No IPv6 peers found.

However, other checks fail, e.g.:

[user@node001 ~]$ ETCD_ENDPOINTS=http://localhost:2379 ./calicoctl get nodes
Failed to create Calico API client: dial tcp 127.0.0.1:2379: connect: connection refused
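
Side note: since nothing is listening on 127.0.0.1:2379 here, calicoctl can instead be pointed at the Kubernetes API datastore, roughly like this (the kubeconfig path is just a placeholder):

DATASTORE_TYPE=kubernetes KUBECONFIG=~/.kube/config ./calicoctl get nodes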

That could be the problem: Docker changes the default FORWARD policy to DROP.
Put the option --iptables=false on the dockerd command line (docker.service) and try again. Make sure that the default FORWARD policy is ACCEPT afterwards.
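
Instead of editing the unit file, the same option can also be set via the daemon config and a restart, e.g.:

# /etc/docker/daemon.json
{
  "iptables": false
}

sudo systemctl restart docker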

PS: it worked for me.

I did restart the Docker service with the option but don’t see any change.

[ansible@node001 ~]$ sudo systemctl status  docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-11-25 14:27:17 CET; 33s ago
     Docs: https://docs.docker.com
 Main PID: 30212 (dockerd)
    Tasks: 48
   Memory: 66.7M
   CGroup: /system.slice/docker.service
           └─30212 /usr/bin/dockerd -H fd:// --iptables=false

and here are the iptables rules:

$ sudo iptables -L -v -n
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
  440 20635 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
 1284 60510 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
 1284 60510 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
2453K  110M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0   

Also, looking at the rules and the counters (DROP 0 packets, 0 bytes), I would say nothing seems to be blocked by the FORWARD rules.

I’ve restarted the host, so the rules look different now:

$ sudo iptables -L -v -n
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0   

Still no connectivity

kubectl exec -it multitool  -- bash
bash-5.0# ping google.com
PING google.com (172.217.16.142) 56(84) bytes of data.
^C
--- google.com ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 160ms

bash-5.0# ping 172.217.16.142
PING 172.217.16.142 (172.217.16.142) 56(84) bytes of data.
^C
--- 172.217.16.142 ping statistics ---
15 packets transmitted, 0 received, 100% packet loss, time 341ms

Is this due to this rule?

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 1175  155K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0   

So apparently the problem is caused by CentOS 8 using nftables. Thus you have to update the Calico DaemonSet with the following environment variable:

....
    Environment:
      FELIX_IPTABLESBACKEND:              NFT
....
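
One way to set it is with kubectl set env (the DaemonSet and container names depend on whether Canal or Calico is used):

kubectl -n kube-system set env daemonset/canal -c calico-node FELIX_IPTABLESBACKEND=NFT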

Credits go to https://github.com/rancher/rke/issues/1788#issuecomment-565536201
