Hi everyone,
I'm trying to install a Kubernetes cluster (1x master, 1x node, for education purposes) on Ubuntu 20.04 with Docker as the container runtime.
I'm following this guide: Creating a cluster with kubeadm | Kubernetes
I got stuck at kubeadm init.
The kubelet logs show that CNI is missing. However, that doesn't seem right, as kubernetes-cni is installed (see the bottom of this post).
What am I missing / what am I doing wrong?
ccd@master:~$ sudo kubeadm init --control-plane-endpoint master:6443 --pod-network-cidr 10.10.0.0/16
[init] Using Kubernetes version: v1.23.6
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 192.168.1.190]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [192.168.1.190 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [192.168.1.190 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
The kubelet logs show that the cluster is missing CNI:
ccd@master:~$ journalctl -u kubelet -f
Apr 28 15:59:37 master systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 62.
Apr 28 15:59:37 master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Apr 28 15:59:37 master systemd[1]: Started kubelet: The Kubernetes Node Agent.
Apr 28 15:59:38 master kubelet[7546]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Apr 28 15:59:38 master kubelet[7546]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.062742 7546 server.go:446] "Kubelet version" kubeletVersion="v1.23.4"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.063108 7546 server.go:874] "Client rotation is on, will bootstrap in background"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.067385 7546 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.069892 7546 dynamic_cafile_content.go:156] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134696 7546 server.go:693] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134864 7546 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.134953 7546 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135029 7546 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135042 7546 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=true
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135069 7546 state_mem.go:36] "Initialized new in-memory state store"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135118 7546 kubelet.go:313] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135143 7546 client.go:80] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.135155 7546 client.go:99] "Start docker client with request timeout" timeout="2m0s"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145379 7546 docker_service.go:571] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145409 7546 docker_service.go:243] "Hairpin mode is set" hairpinMode=hairpin-veth
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.145480 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147373 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147426 7546 docker_service.go:258] "Docker cri networking managed by the network plugin" networkPluginName="cni"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.147498 7546 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Apr 28 15:59:38 master kubelet[7546]: I0428 15:59:38.156636 7546 docker_service.go:264] "Docker Info" dockerInfo=&{ID:ZYZV:R6FY:CGJA:NOV2:WMQX:AJYB:PE2U:3XFS:Y7OP:OWDK:H5IN:2YOA Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:11 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:false KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:23 OomKillDisable:true NGoroutines:34 SystemTime:2022-04-28T15:59:38.14825269Z LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:5.4.0-107-generic OperatingSystem:Ubuntu 20.04.4 LTS OSVersion:20.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc000b2a230 NCPU:2 MemTotal:2079461376 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:master Labels:[] ExperimentalBuild:false ServerVersion:20.10.7 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[name=apparmor name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[WARNING: No swap limit support]}
Apr 28 15:59:38 master kubelet[7546]: E0428 15:59:38.156710 7546 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Apr 28 15:59:38 master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Apr 28 15:59:38 master systemd[1]: kubelet.service: Failed with result 'exit-code'.
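Interestingly, the very last error line is not about CNI at all, but about a cgroup driver mismatch: the kubelet is configured for "systemd" while Docker runs with "cgroupfs". If it helps, Docker's active driver can be confirmed with a plain docker info query (generic Docker CLI, not part of the guide):
docker info --format '{{.CgroupDriver}}'
Judging by the CgroupDriver field in the Docker info dump above, that would print cgroupfs here. I'm not sure yet whether this is related to the CNI messages.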
It seems it does not exist (at least its configuration does not):
ccd@master:~$ ls -al /etc/cni/net.d
ls: cannot access '/etc/cni/net.d': No such file or directory
ccd@master:~$ ls -al /etc/cni/
ls: cannot access '/etc/cni/': No such file or directory
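The package itself should still have installed its files somewhere, though. If I understand the Debian packaging correctly, kubernetes-cni ships the CNI plugin binaries (under /opt/cni/bin) rather than any config in /etc/cni/net.d, which could be verified with:
dpkg -L kubernetes-cni
ls -al /opt/cni/bin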
But if I look at the package list, it seems to be there!
ccd@master:~$ sudo apt -a list kubernetes-cni
Listing... Done
kubernetes-cni/kubernetes-xenial,now 0.8.7-00 amd64 [installed,automatic]
kubernetes-cni/kubernetes-xenial 0.8.6-00 amd64
kubernetes-cni/kubernetes-xenial 0.7.5-00 amd64
kubernetes-cni/kubernetes-xenial 0.6.0-00 amd64
kubernetes-cni/kubernetes-xenial 0.5.1-00 amd64
kubernetes-cni/kubernetes-xenial 0.3.0.1-07a8a2-00 amd64
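For completeness: as far as I can tell from the guide, the config in /etc/cni/net.d is only created once a pod network add-on (Calico, Flannel, etc.) is applied, and that step comes after kubeadm init succeeds, along the lines of (placeholder manifest, taken from whichever add-on is chosen):
kubectl apply -f <pod-network-addon-manifest>.yaml
So maybe the missing /etc/cni/net.d is expected at this stage and the real blocker is elsewhere? Any pointers appreciated.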