What causes kube-apiserver to lag on startup and die when idle?

Cluster information:

Kubernetes version: 1.25.4
Cloud being used: bare-metal; personal laptop
Installation method: manual; Debian packages
Host OS: Debian GNU/Linux 11 (bullseye); Linux birl-work-laptop 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux
Container Runtime Interface (CRI) and version: containerd github.com/containerd/containerd 1.4.13~ds1 1.4.13~ds1-1~deb11u2
Container Network Interface (CNI) and version: Calico 3.24.5

So after self-solving my very first post, I was able to stumble-crawl-slither further along in my learning. I noticed that my problem revolves around kube-apiserver:

  • It takes anywhere from 10-40 seconds to start and accept connections.
    • until then I just get the problem with 6443 that I originally posted about.
  • It dies after 4 minutes.

Laptop resources being what they are (for a Dell Latitude 5480): four 2.8 GHz cores, 8 GB RAM (no swap), about 41% of which is used when the cluster runs. I'm not sure if there's a resource issue (RAM-wise) or something else.
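(To sanity-check the memory question, the usual commands should show whether the kernel OOM killer is getting involved and how much headroom is left:)

  # any OOM-killer activity since boot?
  dmesg -T | grep -i -E 'out of memory|oom-kill'
  journalctl -k | grep -i oom

  # current memory headroom
  free -h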

I got tired of re-running commands from history, so I scripted the start-up process.
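Roughly, the script just replays what I had been typing by hand. A trimmed sketch (the Calico manifest path is simply whatever I downloaded, and the wait loop is approximate):

  #!/bin/bash
  # Rebuild the single-node control plane from a clean slate.
  kubeadm init
  echo "kubeadm init \$?=$?"

  export KUBECONFIG=/etc/kubernetes/admin.conf
  kubectl get pods --all-namespaces

  # Crude wait loop until the apiserver on 6443 answers.
  echo "Sleeping until kube-apiserver is ready"
  waited=0
  while true; do
      sleep 10
      waited=$((waited + 10))
      echo -n "${waited} seconds in: "
      kubectl get --raw /readyz 2>&1 && break
  done

  # Pod network (Calico 3.24.5 manifest fetched beforehand).
  kubectl apply -f calico.yaml
  kubectl get pods --all-namespaces
  kubectl cluster-info
  kubectl describe pod kube-apiserver-birl-work-laptop --namespace kube-system

Here is the output from one run: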
kubeadm init

[init] Using Kubernetes version: v1.25.4
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [birl-work-laptop kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.168]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [birl-work-laptop localhost] and IPs [10.0.0.168 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [birl-work-laptop localhost] and IPs [10.0.0.168 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 6.003183 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node birl-work-laptop as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node birl-work-laptop as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: x2xdk9.uvwgsdx5l8h5ymb6
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.168:6443 --token c7r78u.80bffvdljps6ybyc --discovery-token-ca-cert-hash sha256:47409a4353c83c3be0aadf2450cb3575b2ccae45e59177b5697b4a5fc278e935 

kubeadm init $?=0

kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   kube-apiserver-birl-work-laptop            0/1     Pending   0          2s
kube-system   kube-controller-manager-birl-work-laptop   0/1     Pending   0          2s

Sleeping until kube-apiserver is ready
10 seconds in: The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?
20 seconds in: The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?
30 seconds in:

poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created

kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY   STATUS    RESTARTS        AGE
kube-system   kube-apiserver-birl-work-laptop            0/1     Running   338 (14s ago)   47s
kube-system   kube-controller-manager-birl-work-laptop   1/1     Running   355 (44s ago)   47s

kubectl cluster-info

Kubernetes control plane is running at https://10.0.0.168:6443
CoreDNS is running at https://10.0.0.168:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

kubectl describe pod kube-apiserver-birl-work-laptop --namespace kube-system

Name:                 kube-apiserver-birl-work-laptop
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 birl-work-laptop/10.0.0.168
Start Time:           Fri, 02 Dec 2022 12:19:48 -0500
Labels:               component=kube-apiserver
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.168:6443
                      kubernetes.io/config.hash: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.mirror: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.seen: 2022-12-02T12:19:03.442051549-05:00
                      kubernetes.io/config.source: file
Status:               Running
IP:                   10.0.0.168
IPs:
  IP:           10.0.0.168
Controlled By:  Node/birl-work-laptop
Containers:
  kube-apiserver:
    Container ID:  containerd://014aea93dd17d17a5ba0ea0952d097d325fab11e2e7ff3e36e3fff550020bf32
    Image:         registry.k8s.io/kube-apiserver:v1.25.4
    Image ID:      registry.k8s.io/kube-apiserver@sha256:ba9fc1737c5b7857f3e19183d1504ec58df0c50d970e0c008e58e8a13dc11422
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-apiserver
      --advertise-address=10.0.0.168
      --allow-privileged=true
      --authorization-mode=Node,RBAC
      --client-ca-file=/etc/kubernetes/pki/ca.crt
      --enable-admission-plugins=NodeRestriction
      --enable-bootstrap-token-auth=true
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
      --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
      --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
      --etcd-servers=https://127.0.0.1:2379
      --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
      --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
      --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
      --requestheader-allowed-names=front-proxy-client
      --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --secure-port=6443
      --service-account-issuer=https://kubernetes.default.svc.cluster.local
      --service-account-key-file=/etc/kubernetes/pki/sa.pub
      --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
      --service-cluster-ip-range=10.96.0.0/12
      --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
      --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    State:          Running
      Started:      Fri, 02 Dec 2022 12:19:42 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 02 Dec 2022 12:19:04 -0500
      Finished:     Fri, 02 Dec 2022 12:19:41 -0500
    Ready:          False
    Restart Count:  338
    Requests:
      cpu:        250m
    Liveness:     http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=8
    Readiness:    http-get https://10.0.0.168:6443/readyz delay=0s timeout=15s period=1s #success=1 #failure=3
    Startup:      http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/ca-certificates from etc-ca-certificates (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /etc/pki from etc-pki (ro)
      /etc/ssl/certs from ca-certs (ro)
      /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
      /usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  ca-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs
    HostPathType:  DirectoryOrCreate
  etc-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ca-certificates
    HostPathType:  DirectoryOrCreate
  etc-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  DirectoryOrCreate
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:  DirectoryOrCreate
  usr-local-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/local/share/ca-certificates
    HostPathType:  DirectoryOrCreate
  usr-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/ca-certificates
    HostPathType:  DirectoryOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  13s   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          13s   kubelet  Container image "registry.k8s.io/kube-apiserver:v1.25.4" already present on machine
  Normal  Created         13s   kubelet  Created container kube-apiserver
  Normal  Started         13s   kubelet  Started container kube-apiserver

At this point, I know it's safe for me to join a worker node if I want to. That has no effect on the eventual death of the apiserver, mind you.

I repeated the describe every 5 seconds until it died.
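(The repeat was just a loop along these lines, which exits on its own once kubectl can no longer reach the apiserver:)

  while kubectl describe pod kube-apiserver-birl-work-laptop --namespace kube-system; do
      sleep 5
  done

Here are the last two results: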

**1**
Name:                 kube-apiserver-birl-work-laptop
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 birl-work-laptop/10.0.0.168
Start Time:           Fri, 02 Dec 2022 12:19:48 -0500
Labels:               component=kube-apiserver
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.168:6443
                      kubernetes.io/config.hash: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.mirror: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.seen: 2022-12-02T12:19:03.442051549-05:00
                      kubernetes.io/config.source: file
Status:               Running
IP:                   10.0.0.168
IPs:
  IP:           10.0.0.168
Controlled By:  Node/birl-work-laptop
Containers:
  kube-apiserver:
    Container ID:  containerd://014aea93dd17d17a5ba0ea0952d097d325fab11e2e7ff3e36e3fff550020bf32
    Image:         registry.k8s.io/kube-apiserver:v1.25.4
    Image ID:      registry.k8s.io/kube-apiserver@sha256:ba9fc1737c5b7857f3e19183d1504ec58df0c50d970e0c008e58e8a13dc11422
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-apiserver
      --advertise-address=10.0.0.168
      --allow-privileged=true
      --authorization-mode=Node,RBAC
      --client-ca-file=/etc/kubernetes/pki/ca.crt
      --enable-admission-plugins=NodeRestriction
      --enable-bootstrap-token-auth=true
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
      --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
      --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
      --etcd-servers=https://127.0.0.1:2379
      --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
      --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
      --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
      --requestheader-allowed-names=front-proxy-client
      --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --secure-port=6443
      --service-account-issuer=https://kubernetes.default.svc.cluster.local
      --service-account-key-file=/etc/kubernetes/pki/sa.pub
      --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
      --service-cluster-ip-range=10.96.0.0/12
      --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
      --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    State:          Running
      Started:      Fri, 02 Dec 2022 12:19:42 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 02 Dec 2022 12:19:04 -0500
      Finished:     Fri, 02 Dec 2022 12:19:41 -0500
    Ready:          True
    Restart Count:  338
    Requests:
      cpu:        250m
    Liveness:     http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=8
    Readiness:    http-get https://10.0.0.168:6443/readyz delay=0s timeout=15s period=1s #success=1 #failure=3
    Startup:      http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/ca-certificates from etc-ca-certificates (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /etc/pki from etc-pki (ro)
      /etc/ssl/certs from ca-certs (ro)
      /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
      /usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  ca-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs
    HostPathType:  DirectoryOrCreate
  etc-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ca-certificates
    HostPathType:  DirectoryOrCreate
  etc-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  DirectoryOrCreate
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:  DirectoryOrCreate
  usr-local-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/local/share/ca-certificates
    HostPathType:  DirectoryOrCreate
  usr-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/ca-certificates
    HostPathType:  DirectoryOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  4m    kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          4m    kubelet  Container image "registry.k8s.io/kube-apiserver:v1.25.4" already present on machine
  Normal  Created         4m    kubelet  Created container kube-apiserver
  Normal  Started         4m    kubelet  Started container kube-apiserver

**2**
The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?

I also ran kubectl logs kube-apiserver-birl-work-laptop --namespace kube-system --follow=true > /tmp/kube-apiserver.log &, which netted me only 123 lines this time (I have another log that's just over 4,500 lines).
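Side note: once the apiserver is down, kubectl can no longer fetch logs at all, but crictl (assuming it is installed and pointed at the containerd socket) can still read the same container's log directly:

  crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a --name kube-apiserver
  crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs <container-id-from-ps>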

Questions:

Q1: What is left over after kubeadm reset --force when you see the following warning:

W1202 12:25:13.194314 1098157 reset.go:103] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://10.0.0.168:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 10.0.0.168:6443: connect: connection refused

I do rm -Rfv /etc/cni/net.d/*, but I wonder if something else lingers around?
(I also do iptables --flush --verbose.)
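(For the record, a fuller sweep pieced together from the notes that kubeadm reset itself prints about what it does not clean up would look something like this:)

  kubeadm reset --force
  rm -Rfv /etc/cni/net.d/*
  # flush every table, not just filter, and drop user-defined chains
  iptables --flush
  iptables -t nat --flush
  iptables -t mangle --flush
  iptables -X
  # only relevant if kube-proxy ran in IPVS mode
  # ipvsadm --clear
  rm -f $HOME/.kube/config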

Q2: Is it normal behavior to wait up to 40s (in my case) for kube-apiserver to accept connections?
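(Related: rather than fixed sleeps, polling the health endpoint directly means the script only waits as long as it has to; as far as I can tell, the default RBAC allows anonymous GETs to /readyz:)

  until curl --silent --fail --insecure https://10.0.0.168:6443/readyz >/dev/null; do
      sleep 2
  done
  echo "kube-apiserver is ready"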


I have exactly the same problem in two clean Arch installations (Kubernetes v1.25.4). kube-apiserver is restarting continuously. Could this be a bug?

At least yours restarts! :sweat_smile:

I replaced “containerd” with “cri-o” and now everything works.

How did you know to do that? That is, what led you down the path of swapping out CRIs?

Really, two things: the Arch wiki provided more information about cri-o (so I guessed it was better tested), and this article: Understanding how Kubernetes 1.24 and the dockershim deprecation broke the Kubelet in Arch Linux. | Medium (the author recommends cri-o after spending a whole day trying to make Kubernetes work with containerd).

Interesting. I'll have to take some time later to swap in cri-o myself, but today ain't that day.
Thanks for the idea.
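For when I do get around to it, my understanding is that the kubeadm side of the swap is mostly a matter of pointing init at cri-o's default socket instead of containerd's:

  # after installing and enabling cri-o on the host
  kubeadm init --cri-socket unix:///var/run/crio/crio.sock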