Kubeadm init error - API server is not running

Cluster information:

Kubernetes version: 1.26
Cloud being used: (put bare-metal if not on a public cloud)
Installation method: kubeadm
Host OS: Ubuntu 20.04
CNI and version: cni-plugins-linux-amd64-v1.3.0.tgz
CRI and version: containerd-1.7.5-linux-amd64.tar.gz

I tried to install a cluster with kubeadm using the steps below, but the last step, kubeadm init, gets stuck with an error. Could anyone advise what's wrong with my steps?

  1. Preparation

    1.1 Turn off swap. The cloud base image (configured via cloud-init) has no swap entry in /etc/fstab, but I run the command anyway:

    sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
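
    One note on this step: commenting out the fstab entry only keeps swap off after the next reboot; to disable it for the current boot you also need swapoff. A minimal sketch:

    ```shell
    # Disable swap immediately (kubelet refuses to start while swap is on)
    sudo swapoff -a
    # Comment out any swap entries so it stays disabled across reboots
    sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
    # Verify: the "Swap:" line should report 0B total
    free -h
    ```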
    
  2. Install container runtime

    References: https://github.com/containerd/containerd/blob/main/docs/getting-started.md

    2.1 Install containerd

    wget https://github.com/containerd/containerd/releases/download/v1.7.5/containerd-1.7.5-linux-amd64.tar.gz
    sudo tar Cxzvf /usr/local containerd-1.7.5-linux-amd64.tar.gz


    2.2 Install the containerd systemd unit

    wget https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
    sudo mkdir -p /usr/local/lib/systemd/system/
    sudo mv containerd.service /usr/local/lib/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable --now containerd
    

    2.3 Install runc

    wget https://github.com/opencontainers/runc/releases/download/v1.1.9/runc.amd64
    sudo install -m 755 runc.amd64 /usr/local/sbin/runc
    

    2.4 Install CNI plugins

    wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz
    sudo mkdir -p /opt/cni/bin
    sudo tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz
    

    2.5 Install and configure prerequisites

    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    sudo sysctl --system
    
    sudo sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
    
    
  3. Configure containerd to use systemd

I manually created the file /etc/containerd/config.toml with the content below:

version = 2

[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

Then restart it:

sudo systemctl restart containerd
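
After restarting, it is worth confirming that containerd actually picked up both overrides. `containerd config dump` prints the merged runtime configuration, so grepping it (a quick check, assuming the config file above is at /etc/containerd/config.toml) shows whether the settings are active:

```shell
# Show the merged containerd config and confirm both overrides took effect:
# expect "SystemdCgroup = true" and the aliyuncs pause image to appear
sudo containerd config dump | grep -E 'SystemdCgroup|sandbox_image'
```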
  4. Installing kubeadm, kubelet and kubectl

    sudo apt-get update
    sudo apt-get install -y apt-transport-https ca-certificates curl gpg
    sudo mkdir -p -m 755 /etc/apt/keyrings
    curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.26/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.26/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

    sudo apt-get update
    sudo apt-get install -y kubelet kubeadm kubectl
    sudo apt-mark hold kubelet kubeadm kubectl

  5. Run kubeadm init

    sudo kubeadm init \
       --pod-network-cidr=192.168.0.0/16 \
       --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
       -v5

The program hangs here:

I0909 14:11:41.025114    4493 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

Checking the kubelet logs with sudo journalctl -xe -u kubelet shows the errors below (partial output):

Sep 09 14:14:00 k8sserver1 kubelet[4581]: E0909 14:14:00.925368    4581 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://192.168.5.140:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 192.168.5.140:6443: connect: connection refused
Sep 09 14:14:01 k8sserver1 kubelet[4581]: E0909 14:14:01.796425    4581 eviction_manager.go:261] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"k8sserver1\" not found"
Sep 09 14:14:02 k8sserver1 kubelet[4581]: W0909 14:14:02.941561    4581 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: Get "https://192.168.5.140:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 192.168.5.140:6443: connect: connection refused
Sep 09 14:14:02 k8sserver1 kubelet[4581]: E0909 14:14:02.941680    4581 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get "https://192.168.5.140:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 192.168.5.140:6443: connect: connection refused
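
For what it's worth, "connection refused" on port 6443 just means the kube-apiserver static pod never came up (or is crash-looping); the kubelet messages above are a symptom, not the cause. With containerd you can inspect the pod directly through crictl (assuming the default containerd socket path):

```shell
# Point crictl at containerd and list all containers, including exited ones
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
# If kube-apiserver is listed but exited, its logs usually show the real error:
#   sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs <container-id>
```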

I think we faced the same problem (the GFW); it took me hours to figure out.

After changing the default sandbox image of containerd, it started working:

containerd config default > /etc/containerd/config.toml
sed -i 's/registry.k8s.io/registry.aliyuncs.com\/google_containers/' /etc/containerd/config.toml
systemctl daemon-reload
systemctl restart containerd
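
One addition, in case it helps: if kubeadm init already failed once, the control-plane pods were created under the old containerd config, so it is safer to reset before retrying (this wipes the partial cluster state, which is fine on a fresh single node):

```shell
# Tear down the partially initialized control plane
sudo kubeadm reset -f
# Make sure containerd is running with the updated config
sudo systemctl restart containerd
# Re-run the original init command
sudo kubeadm init \
   --pod-network-cidr=192.168.0.0/16 \
   --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
   -v5
```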

I noticed you already changed the sandbox image. Did you run systemctl daemon-reload after that? That is the main difference between our scripts.


It's useful.