K8s CNI plugin issue: "cni plugin not initialized" while kube-flannel is running

Hi, I am new to Kubernetes. After running kubeadm init --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16, I installed flannel as my CNI. It worked fine until I ran kubeadm reset and then kubeadm init again. After that, the following error appeared in journalctl -f:

May 09 16:56:52 k8s-1 kubelet[3418632]: E0509 16:56:52.835517 3418632 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to setup network for sandbox \"a3d0dad567f549ae1a2b5b81632203f49da0f281e03e78785c956234cf4eea89\": plugin type=\"flannel\" failed (add): failed to delegate add: failed to set bridge addr: \"cni0\" already has an IP address different from 10.244.0.1/24" pod="kube-system/coredns-5bbd96d687-fqnfm"
May 09 16:56:52 k8s-1 kubelet[3418632]: E0509 16:56:52.835529 3418632 kuberuntime_manager.go:782] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox \"a3d0dad567f549ae1a2b5b81632203f49da0f281e03e78785c956234cf4eea89\": plugin type=\"flannel\" failed (add): failed to delegate add: failed to set bridge addr: \"cni0\" already has an IP address different from 10.244.0.1/24" pod="kube-system/coredns-5bbd96d687-fqnfm"
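
The message says the existing cni0 bridge still carries an address that differs from the one flannel wants to assign. A minimal sketch for confirming that on a node is below. It assumes flannel has written its per-node subnet to /run/flannel/subnet.env as FLANNEL_SUBNET and that the bridge is named cni0 as in the log; the function names extract_ipv4 and check_bridge_addr are made up for illustration.

```shell
#!/bin/sh
# Print the first IPv4 address/prefix found in `ip -o -4 addr show dev <iface>`
# output read from stdin (the address is the 4th whitespace-separated field).
extract_ipv4() {
    awk '$3 == "inet" { print $4; exit }'
}

# Compare the bridge's current address with the one flannel expects.
# Both arguments are CIDR strings such as 10.244.0.1/24.
check_bridge_addr() {
    actual="$1"
    expected="$2"
    if [ -z "$actual" ]; then
        echo "no address on bridge"
    elif [ "$actual" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch: bridge has $actual, flannel expects $expected"
    fi
}

# On a node (path and interface name are assumptions, see above):
#   . /run/flannel/subnet.env
#   check_bridge_addr "$(ip -o -4 addr show dev cni0 | extract_ipv4)" "$FLANNEL_SUBNET"
```

A "mismatch" result would be consistent with a cni0 bridge left over from the previous cluster.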

After searching, I chose to run the following commands:

ifconfig cni0 down    
ifconfig flannel.1 down    
ip link delete cni0
ip link delete flannel.1

After that, a different error showed up in journalctl -f:

May 09 18:12:05 server kubelet[7762]: E0509 18:12:05.424128    7762 kubelet.go:2760] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
May 09 18:12:10 server kubelet[7762]: E0509 18:12:10.425659    7762 kubelet.go:2760] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

The flannel pod is running, but the coredns pods are still Pending, and the cni0 interface is missing.

[root@server k8s]# kubectl get pods --all-namespaces
NAMESPACE      NAME                             READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-55qbm            1/1     Running   0          39m
kube-system    coredns-7bdc4cb885-jrrck         0/1     Pending   0          41m
kube-system    coredns-7bdc4cb885-p4j7h         0/1     Pending   0          41m
kube-system    etcd-server                      1/1     Running   12         41m
kube-system    kube-apiserver-server            1/1     Running   14         41m
kube-system    kube-controller-manager-server   1/1     Running   12         41m
kube-system    kube-proxy-vt9pd                 1/1     Running   0          41m
kube-system    kube-scheduler-server            1/1     Running   12         41m


[root@server k8s]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:72:1a:9a brd ff:ff:ff:ff:ff:ff
    inet 192.168.39.6/24 brd 192.168.39.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::ad0f:3bd5:2d65:be2d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:d7:3f:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:d7:3f:83 brd ff:ff:ff:ff:ff:ff
10: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 6a:36:b3:6f:8e:f7 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::6836:b3ff:fe6f:8ef7/64 scope link
       valid_lft forever preferred_lft forever
11: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:80:25:3f:17 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
    inet6 fe80::42:80ff:fe25:3f17/64 scope link
       valid_lft forever preferred_lft forever
12: br-d4a7d41e15ba: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:ca:f4:1f:a7 brd ff:ff:ff:ff:ff:ff
    inet 172.19.0.1/16 brd 172.19.255.255 scope global br-d4a7d41e15ba
       valid_lft forever preferred_lft forever
13: br-e06ff2880e2b: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:25:39:95:04 brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.1/16 brd 172.20.255.255 scope global br-e06ff2880e2b
       valid_lft forever preferred_lft forever
14: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:88:50:92:b3 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
20: veth65ffc0a@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 3a:90:99:cc:f2:99 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::3890:99ff:fecc:f299/64 scope link
       valid_lft forever preferred_lft forever
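
Since the flannel pod is Running while the runtime reports "cni plugin not initialized", one thing worth checking is whether a CNI config file actually exists where the container runtime looks for it. The sketch below assumes containerd's default CNI config directory /etc/cni/net.d and counts files with the usual CNI config extensions; the function names are hypothetical.

```shell
#!/bin/sh
# List CNI network config files directly inside a directory
# (default: /etc/cni/net.d, assumed to be the runtime's CNI conf dir).
list_cni_configs() {
    dir="${1:-/etc/cni/net.d}"
    [ -d "$dir" ] || return 0   # a missing directory simply yields no files
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) | sort
}

# Print a one-line summary of how many config files were found.
cni_config_status() {
    dir="${1:-/etc/cni/net.d}"
    count=$(list_cni_configs "$dir" | wc -l | tr -d ' ')
    if [ "$count" -eq 0 ]; then
        echo "no CNI config in $dir"
    else
        echo "$count CNI config file(s) in $dir"
    fi
}

# On a node:
#   cni_config_status
```

If a config file is present and the kubelet still logs the error, that points away from flannel and toward the runtime not having picked the file up, which would fit the replies in this thread where a restart made the problem go away.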

Cluster information:

Kubernetes version: v1.27.1
Cloud being used: (put bare-metal if not on a public cloud)
Installation method: yum
Host OS: CentOS 7
CNI and version: flannel v0.21.5
CRI and version: containerd, 1.6.21

I had the same issue and couldn't figure out the root cause, so I restarted the machine. After the restart, the problem was gone.

Before that, I had reset the cluster and re-initialized a new one. Maybe something was not cleaned up, which caused the issue.

F* me… a simple restart of the node did the trick for me as well. Thanks <3

Ran into this problem again so I spent more time trying to solve it. Although I haven’t been able to pinpoint the root cause yet, I suspect it’s a containerd issue.

I can reproduce it with the following steps:

  • kubeadm reset
  • sudo ip link set cni0 down && sudo ip link delete cni0
  • sudo rm -rf /etc/cni/net.d/
  • initialize the cluster again with the flannel CNI plugin

Restarting containerd resolves the issue:

sudo systemctl restart containerd
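
The commands mentioned in this thread can be collected into a small helper, sketched below. It is only an illustration: the function names and the DRY_RUN switch are made up, it prints the commands by default instead of running them, and it uses only commands that appear above (plus the -f flag of kubeadm reset to skip the confirmation prompt).

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: tear down the old cluster and remove the stale flannel interfaces.
# With DRY_RUN=0, deleting an interface that does not exist just reports an error.
cleanup_old_network() {
    run kubeadm reset -f
    for iface in cni0 flannel.1; do
        run ip link delete "$iface"
    done
}

# Step 2: after `kubeadm init` and re-applying flannel, restart the runtime.
refresh_runtime() {
    run systemctl restart containerd
}

# Usage:
#   cleanup_old_network   # then run kubeadm init and install flannel again
#   refresh_runtime
```

Running it in dry-run mode first makes it easy to review exactly what would be executed on the node.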

See also: Troubleshooting CNI plugin-related errors