Hi everyone,
I'm trying to install a Kubernetes cluster (1x master, 1x node, for education purposes) on Ubuntu 20.04 with Docker as the container runtime.
I'm following this guide: Creating a cluster with kubeadm | Kubernetes
I got stuck at kubeadm init.
The kubelet logs show that CNI is missing. However, that doesn't seem right, as kubernetes-cni is installed (see the bottom of this post).
What am I missing / what am I doing wrong?
ccd@master:~$ sudo kubeadm init --control-plane-endpoint master:6443 --pod-network-cidr 10.10.0.0/16
[init] Using Kubernetes version: v1.23.6
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 192.168.1.190]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [192.168.1.190 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [192.168.1.190 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
The kubelet logs show that the cluster is missing CNI:
ccd@master:~$ journalctl -u kubelet -f
Apr 28 15:59:37 master systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 62.
Apr 28 15:59:37 master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Apr 28 15:59:37 master systemd[1]: Started kubelet: The Kubernetes Node Agent.
Apr 28 15:59:38 master kubelet[7546]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Apr 28 15:59:38 master kubelet[7546]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.062742 7546 server.go:446] "Kubelet version" kubeletVersion="v1.23.4"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.063108 7546 server.go:874] "Client rotation is on, will bootstrap in background"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.067385 7546 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.069892 7546 dynamic_cafile_content.go:156] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134696 7546 server.go:693] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134864 7546 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134953 7546 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135029 7546 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135042 7546 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=true
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135069 7546 state_mem.go:36] "Initialized new in-memory state store"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135118 7546 kubelet.go:313] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135143 7546 client.go:80] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135155 7546 client.go:99] "Start docker client with request timeout" timeout="2m0s"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145379 7546 docker_service.go:571] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145409 7546 docker_service.go:243] "Hairpin mode is set" hairpinMode=hairpin-veth
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145480 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147373 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147426 7546 docker_service.go:258] "Docker cri networking managed by the network plugin" networkPluginName="cni"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147498 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.156636 7546 docker_service.go:264] "Docker Info" dockerInfo=&{ID:ZYZV:R6FY:CGJA:NOV2:WMQX:AJYB:PE2U:3XFS:Y7OP:OWDK:H5IN:2YOA Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:11 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:false KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:23 OomKillDisable:true NGoroutines:34 SystemTime:2022-04-28T15:59:38.14825269Z LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:5.4.0-107-generic OperatingSystem:Ubuntu 20.04.4 LTS OSVersion:20.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc000b2a230 NCPU:2 MemTotal:2079461376 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:master Labels:[] ExperimentalBuild:false ServerVersion:20.10.7 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[name=apparmor name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[WARNING: No swap limit support]}
Apr 28 15:59:38 master kubelet[7546]: E0428 15:59:38.156710 7546 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Apr 28 15:59:38 master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Apr 28 15:59:38 master systemd[1]: kubelet.service: Failed with result 'exit-code'.
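Interestingly, the very last error line is not about CNI at all, but about a cgroup driver mismatch: the kubelet is configured for "systemd" while Docker runs with "cgroupfs". If it helps, Docker's active driver can be confirmed with a plain docker info query (generic Docker CLI, not part of the guide):
docker info --format '{{.CgroupDriver}}'
Judging by the CgroupDriver field in the Docker info dump above, that would print cgroupfs here. I'm not sure yet whether this is related to the CNI messages.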
It seems it does not exist (at least its configuration does not):
ccd@master:~$ ls -al /etc/cni/net.d
ls: cannot access '/etc/cni/net.d': No such file or directory
ccd@master:~$ ls -al /etc/cni/
ls: cannot access '/etc/cni/': No such file or directory
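The package itself should still have installed its files somewhere, though. If I understand the Debian packaging correctly, kubernetes-cni ships the CNI plugin binaries (under /opt/cni/bin) rather than any config in /etc/cni/net.d, which could be verified with:
dpkg -L kubernetes-cni
ls -al /opt/cni/bin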
But if I look at the package list, it seems to be there!
ccd@master:~$ sudo apt -a list kubernetes-cni
Listing... Done
kubernetes-cni/kubernetes-xenial,now 0.8.7-00 amd64 [installed,automatic]
kubernetes-cni/kubernetes-xenial 0.8.6-00 amd64
kubernetes-cni/kubernetes-xenial 0.7.5-00 amd64
kubernetes-cni/kubernetes-xenial 0.6.0-00 amd64
kubernetes-cni/kubernetes-xenial 0.5.1-00 amd64
kubernetes-cni/kubernetes-xenial 0.3.0.1-07a8a2-00 amd64
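For completeness: as far as I can tell from the guide, the config in /etc/cni/net.d is only created once a pod network add-on (Calico, Flannel, etc.) is applied, and that step comes after kubeadm init succeeds, along the lines of (placeholder manifest, taken from whichever add-on is chosen):
kubectl apply -f <pod-network-addon-manifest>.yaml
So maybe the missing /etc/cni/net.d is expected at this stage and the real blocker is elsewhere? Any pointers appreciated.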