Kubeadm init version number bingo

During the autumn I did a week of training, LFS458. During that training I was quite successful at installing and running Kubernetes 1.24 on Ubuntu 20.04 with containerd and Calico. I have been able to reproduce what was done in class privately by pinning the exact version numbers used in class, but getting it all working on more current versions of everything seems to be beyond what I can handle.

As the class material shows how to use kubeadm to upgrade a cluster, I intend to run 1.25 on Ubuntu 22.04 and subsequently upgrade to 1.26. So far, however, I have not been able to get kubeadm init to succeed on Ubuntu 22.04.
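
For context, the upgrade I plan to do later would look roughly like this, if I read the class material correctly (the 1.26 patch version below is a placeholder):

# planned follow-up: upgrade the control plane from 1.25 to 1.26
apt-get install -y kubeadm=1.26.x-00
kubeadm upgrade plan
kubeadm upgrade apply v1.26.x
apt-get install -y kubelet=1.26.x-00 kubectl=1.26.x-00
systemctl restart kubelet

But first init has to work at all.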

There are no containers running on my system:

root@k8scp:~# ctr container ls
CONTAINER    IMAGE    RUNTIME    
root@k8scp:~# ctr --namespace k8s.io container ls
CONTAINER    IMAGE    RUNTIME    
root@k8scp:~# crictl ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD

network is simple and name resolution seems to work:

root@k8scp:~# ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128 
enp1s0           UP             192.168.100.10/24 fe80::5054:ff:fed7:985e/64 
root@k8scp:~# getent hosts k8scp
192.168.100.10  k8scp
root@k8scp:~# getent hosts k8s
192.168.100.10  k8s
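
Both names resolve to the host's own address; as far as I can tell they come from a plain /etc/hosts entry along these lines (not from DNS):

192.168.100.10  k8s k8scp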

kubeadm config is trivial:

root@k8scp:~# cat kubeadm-config.yaml 
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.25.6
controlPlaneEndpoint: "k8s:6443"
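
If I understand the kubeadm docs correctly, that config should be equivalent to passing the same values as flags:

kubeadm init --kubernetes-version 1.25.6 --control-plane-endpoint "k8s:6443"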

Yet kubeadm init fails:

root@k8scp:~# kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
[... all steps before the addons succeed ...]
[addons] Applied essential addon: CoreDNS
error execution phase addon/kube-proxy: error when creating kube-proxy service account: unable to create serviceaccount: Post "https://k8s:6443/api/v1/namespaces/kube-system/serviceaccounts?timeout=10s": dial tcp 192.168.100.10:6443: connect: connection refused
To see the stack trace of this error execute with --v=5 or higher

There is something listening on port 6443:

root@k8scp:~# ss -lnp | grep :6443
tcp   LISTEN 0      4096                                                                                    *:6443                   *:*    users:(("kube-apiserver",pid=3362,fd=7))           
root@k8scp:~# curl -k https://k8s:6443/
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}
root@k8scp:~# openssl s_client -connect k8s:6443 | openssl x509 -noout -text | grep k8s
                DNS:k8s, DNS:k8scp, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:10.96.0.1, IP Address:192.168.100.10
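
I am not sure where to dig next; I assume the kubelet journal and the API server container logs are the place to look (the container ID below is a placeholder):

# recent kubelet messages
journalctl -u kubelet --no-pager -n 50
# find the (possibly exited) apiserver container and read its logs
crictl ps -a | grep kube-apiserver
crictl logs <container-id>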

I even get the node partially working:

root@k8scp:~# kubectl get no
NAME    STATUS     ROLES           AGE     VERSION
k8scp   NotReady   control-plane   5m26s   v1.25.6

But the API server is crashing quite frequently:

root@k8scp:~# kubectl -n kube-system get all
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
The connection to the server k8s:6443 was refused - did you specify the right host or port?
root@k8scp:~# systemctl restart kubelet
root@k8scp:~# kubectl -n kube-system get all
NAME                                READY   STATUS    RESTARTS       AGE
pod/coredns-565d847f94-c97gl        0/1     Pending   0              9m39s
pod/coredns-565d847f94-dt57b        0/1     Pending   0              9m39s
pod/etcd-k8scp                      0/1     Running   56 (18s ago)   8m53s
pod/kube-apiserver-k8scp            0/1     Running   46 (13s ago)   10m
pod/kube-controller-manager-k8scp   0/1     Running   22 (31s ago)   8m53s
pod/kube-scheduler-k8scp            0/1     Running   59 (33s ago)   10m

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   10m

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/kube-proxy   1         0         0       0            0           kubernetes.io/os=linux   10m

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   0/2     2            0           10m

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-565d847f94   2         2         0       9m39s
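
Those restart counts suggest the static pods keep getting killed. I assume something like this would show the last termination state, using the etcd pod from the listing above as an example:

kubectl -n kube-system describe pod etcd-k8scp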

What am I doing wrong? It seems like the smallest change in version numbers on any component leaves Kubernetes defunct.

Cluster information:

Kubernetes version: apt-get install -y kubeadm=1.25.6-00 kubelet=1.25.6-00 kubectl=1.25.6-00
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: Ubuntu 22.04.2 LTS
CNI and version: n/a
CRI and version: containerd containerd.io 1.6.18 2456e983eb9e37e47538f59ea18f2043c9a73640

IIRC, Ubuntu 21.04+ defaults to cgroup v2, and there are known issues with the kubelet/containerd configuration where the cgroup driver should be set to systemd instead of cgroupfs. I'd try setting SystemdCgroup = true in the containerd config and restarting containerd and the kubelet.

There's a bit more info here that might help if this is indeed the issue.
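
If you want to confirm the machine really is on cgroup v2 first, this should print cgroup2fs on a v2-only host:

stat -fc %T /sys/fs/cgroup/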


I have no effective (non-comment) config for containerd:

root@k8scp:~# grep -v -e '^\s*#' -e '^\s*$' /etc/containerd/config.toml

So I generated one I could tweak:

root@k8scp:~# containerd config dump > /etc/containerd/config.toml
WARN[0000] containerd config version `1` has been deprecated and will be removed in containerd v2.0, please switch to version `2`, see https://github.com/containerd/containerd/blob/main/docs/PLUGINS.md#version-header 
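
Side note: the containerd documentation generally shows containerd config default for this step, which prints the built-in defaults rather than dumping the currently loaded configuration; either way the goal is a complete file to edit:

containerd config default > /etc/containerd/config.toml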

Per the instructions you linked I switched SystemdCgroup to true:

root@k8scp:~# grep SystemdCgroup /etc/containerd/config.toml 
            SystemdCgroup = false
root@k8scp:~# sed -i '/SystemdCgroup/s/false/true/' /etc/containerd/config.toml
root@k8scp:~# grep SystemdCgroup /etc/containerd/config.toml 
            SystemdCgroup = true
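
In theory restarting the two services should be enough to pick up the change:

systemctl restart containerd
systemctl restart kubelet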

I rebooted for good measure instead, and after that kubeadm init succeeded!! Super, thank you!! :tada:


I had the same issue while creating a v1.29.3 cluster on my Ubuntu 22.04.3 LTS test machine via kubeadm init. I followed the fix azzid described and it solved my issue: I generated the containerd configuration file, restarted containerd via systemctl, and reran kubeadm init.
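
Roughly the sequence, pieced together from the posts above (the reset is only needed if an earlier init left state behind, and the init flags are whatever you used before):

containerd config default > /etc/containerd/config.toml
sed -i '/SystemdCgroup/s/false/true/' /etc/containerd/config.toml
systemctl restart containerd
kubeadm reset -f                      # only if a previous kubeadm init partially ran
kubeadm init --config=kubeadm-config.yaml --upload-certs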