What causes kube-apiserver to lag on startup and die when idle?

Cluster information:

Kubernetes version: 1.25.4
Cloud being used: bare-metal; personal laptop
Installation method: manual; Debian packages
Host OS: Debian GNU/Linux 11 (bullseye); Linux birl-work-laptop 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux
Container Runtime Interface (CRI) and version: containerd github.com/containerd/containerd 1.4.13~ds1 1.4.13~ds1-1~deb11u2
Container Network Interface (CNI) and version: Calico 3.24.5

So after self-solving my very first post, I was able to stumble-crawl-slither further along in my learning. I noticed that my problem revolves around kube-apiserver:

  • It takes anywhere from 10-40 seconds to start and accept connections.
    • Until then I just get the same connection-refused error on port 6443 that I originally posted about.
  • It dies after 4 minutes.

Laptop resources being what they are (it's a Dell Latitude 5480): four 2.8 GHz cores, 8 GB RAM (no swap) – about 41% used in total when the cluster runs – I'm not sure if there's a resource issue (RAM-wise) or something else.
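In case it helps, here is roughly how I snapshot memory headroom while the cluster is up (plain procps tools, nothing cluster-specific):

  free -h                                            # overall RAM headroom (no swap on this box)
  vmstat 5 3                                         # si/so columns would show swapping
  ps -C kube-apiserver,etcd -o rss,comm --sort=-rss  # resident memory of the big control-plane processes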

I got tired of re-running commands from history, so I scripted the start-up process:
kubeadm init

[init] Using Kubernetes version: v1.25.4
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [birl-work-laptop kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.168]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [birl-work-laptop localhost] and IPs [10.0.0.168 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [birl-work-laptop localhost] and IPs [10.0.0.168 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 6.003183 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node birl-work-laptop as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node birl-work-laptop as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: x2xdk9.uvwgsdx5l8h5ymb6
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.168:6443 --token c7r78u.80bffvdljps6ybyc --discovery-token-ca-cert-hash sha256:47409a4353c83c3be0aadf2450cb3575b2ccae45e59177b5697b4a5fc278e935 

kubeadm init $?=0

kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   kube-apiserver-birl-work-laptop            0/1     Pending   0          2s
kube-system   kube-controller-manager-birl-work-laptop   0/1     Pending   0          2s

Sleeping until kube-apiserver is ready
10 seconds in: The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?
20 seconds in: The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?
30 seconds in:

poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created

kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY   STATUS    RESTARTS        AGE
kube-system   kube-apiserver-birl-work-laptop            0/1     Running   338 (14s ago)   47s
kube-system   kube-controller-manager-birl-work-laptop   1/1     Running   355 (44s ago)   47s

kubectl cluster-info

Kubernetes control plane is running at https://10.0.0.168:6443
CoreDNS is running at https://10.0.0.168:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
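(The "Sleeping until kube-apiserver is ready" step above is nothing clever – roughly the loop below, assuming KUBECONFIG already points at admin.conf; the real script applies Calico once the loop exits.)

  waited=0
  until kubectl get --raw /readyz >/dev/null 2>&1; do
      sleep 10
      waited=$((waited + 10))
      echo "${waited} seconds in: $(kubectl get --raw /readyz 2>&1)"
  done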

kubectl describe pod kube-apiserver-birl-work-laptop --namespace kube-system

Name:                 kube-apiserver-birl-work-laptop
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 birl-work-laptop/10.0.0.168
Start Time:           Fri, 02 Dec 2022 12:19:48 -0500
Labels:               component=kube-apiserver
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.168:6443
                      kubernetes.io/config.hash: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.mirror: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.seen: 2022-12-02T12:19:03.442051549-05:00
                      kubernetes.io/config.source: file
Status:               Running
IP:                   10.0.0.168
IPs:
  IP:           10.0.0.168
Controlled By:  Node/birl-work-laptop
Containers:
  kube-apiserver:
    Container ID:  containerd://014aea93dd17d17a5ba0ea0952d097d325fab11e2e7ff3e36e3fff550020bf32
    Image:         registry.k8s.io/kube-apiserver:v1.25.4
    Image ID:      registry.k8s.io/kube-apiserver@sha256:ba9fc1737c5b7857f3e19183d1504ec58df0c50d970e0c008e58e8a13dc11422
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-apiserver
      --advertise-address=10.0.0.168
      --allow-privileged=true
      --authorization-mode=Node,RBAC
      --client-ca-file=/etc/kubernetes/pki/ca.crt
      --enable-admission-plugins=NodeRestriction
      --enable-bootstrap-token-auth=true
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
      --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
      --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
      --etcd-servers=https://127.0.0.1:2379
      --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
      --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
      --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
      --requestheader-allowed-names=front-proxy-client
      --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --secure-port=6443
      --service-account-issuer=https://kubernetes.default.svc.cluster.local
      --service-account-key-file=/etc/kubernetes/pki/sa.pub
      --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
      --service-cluster-ip-range=10.96.0.0/12
      --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
      --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    State:          Running
      Started:      Fri, 02 Dec 2022 12:19:42 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 02 Dec 2022 12:19:04 -0500
      Finished:     Fri, 02 Dec 2022 12:19:41 -0500
    Ready:          False
    Restart Count:  338
    Requests:
      cpu:        250m
    Liveness:     http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=8
    Readiness:    http-get https://10.0.0.168:6443/readyz delay=0s timeout=15s period=1s #success=1 #failure=3
    Startup:      http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/ca-certificates from etc-ca-certificates (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /etc/pki from etc-pki (ro)
      /etc/ssl/certs from ca-certs (ro)
      /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
      /usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  ca-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs
    HostPathType:  DirectoryOrCreate
  etc-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ca-certificates
    HostPathType:  DirectoryOrCreate
  etc-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  DirectoryOrCreate
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:  DirectoryOrCreate
  usr-local-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/local/share/ca-certificates
    HostPathType:  DirectoryOrCreate
  usr-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/ca-certificates
    HostPathType:  DirectoryOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  13s   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          13s   kubelet  Container image "registry.k8s.io/kube-apiserver:v1.25.4" already present on machine
  Normal  Created         13s   kubelet  Created container kube-apiserver
  Normal  Started         13s   kubelet  Started container kube-apiserver
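Side note on the describe output above: Exit Code 137 is SIGKILL (128 + 9), so the container was killed rather than crashing on its own – typically either the kernel OOM killer or the kubelet reacting to failed liveness probes. To rule the OOM killer in or out, I check the kernel log:

  journalctl -k | grep -iE 'out of memory|killed process'
  dmesg | grep -i oom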

At this point, I know it's safe to join a worker node if I want to. That has no effect on the eventual death of the apiserver, mind you.

I repeated the describe every 5 seconds until it died.
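(The loop was just this – it exits on its own once kubectl starts getting connection refused:)

  while kubectl describe pod kube-apiserver-birl-work-laptop --namespace kube-system; do
      sleep 5
  done

Here are the last two results: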

**1**
Name:                 kube-apiserver-birl-work-laptop
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 birl-work-laptop/10.0.0.168
Start Time:           Fri, 02 Dec 2022 12:19:48 -0500
Labels:               component=kube-apiserver
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.168:6443
                      kubernetes.io/config.hash: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.mirror: 0fa3a7ee79ee8742d4353760695ed280
                      kubernetes.io/config.seen: 2022-12-02T12:19:03.442051549-05:00
                      kubernetes.io/config.source: file
Status:               Running
IP:                   10.0.0.168
IPs:
  IP:           10.0.0.168
Controlled By:  Node/birl-work-laptop
Containers:
  kube-apiserver:
    Container ID:  containerd://014aea93dd17d17a5ba0ea0952d097d325fab11e2e7ff3e36e3fff550020bf32
    Image:         registry.k8s.io/kube-apiserver:v1.25.4
    Image ID:      registry.k8s.io/kube-apiserver@sha256:ba9fc1737c5b7857f3e19183d1504ec58df0c50d970e0c008e58e8a13dc11422
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-apiserver
      --advertise-address=10.0.0.168
      --allow-privileged=true
      --authorization-mode=Node,RBAC
      --client-ca-file=/etc/kubernetes/pki/ca.crt
      --enable-admission-plugins=NodeRestriction
      --enable-bootstrap-token-auth=true
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
      --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
      --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
      --etcd-servers=https://127.0.0.1:2379
      --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
      --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
      --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
      --requestheader-allowed-names=front-proxy-client
      --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --secure-port=6443
      --service-account-issuer=https://kubernetes.default.svc.cluster.local
      --service-account-key-file=/etc/kubernetes/pki/sa.pub
      --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
      --service-cluster-ip-range=10.96.0.0/12
      --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
      --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    State:          Running
      Started:      Fri, 02 Dec 2022 12:19:42 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 02 Dec 2022 12:19:04 -0500
      Finished:     Fri, 02 Dec 2022 12:19:41 -0500
    Ready:          True
    Restart Count:  338
    Requests:
      cpu:        250m
    Liveness:     http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=8
    Readiness:    http-get https://10.0.0.168:6443/readyz delay=0s timeout=15s period=1s #success=1 #failure=3
    Startup:      http-get https://10.0.0.168:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/ca-certificates from etc-ca-certificates (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /etc/pki from etc-pki (ro)
      /etc/ssl/certs from ca-certs (ro)
      /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
      /usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  ca-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs
    HostPathType:  DirectoryOrCreate
  etc-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ca-certificates
    HostPathType:  DirectoryOrCreate
  etc-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  DirectoryOrCreate
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:  DirectoryOrCreate
  usr-local-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/local/share/ca-certificates
    HostPathType:  DirectoryOrCreate
  usr-share-ca-certificates:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/ca-certificates
    HostPathType:  DirectoryOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  4m    kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          4m    kubelet  Container image "registry.k8s.io/kube-apiserver:v1.25.4" already present on machine
  Normal  Created         4m    kubelet  Created container kube-apiserver
  Normal  Started         4m    kubelet  Started container kube-apiserver

**2**
The connection to the server 10.0.0.168:6443 was refused - did you specify the right host or port?
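Once the apiserver is gone, the only first-hand account left is the kubelet's, so I also check its journal around the time of death:

  journalctl -u kubelet --since "10 min ago" --no-pager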

I also run kubectl logs kube-apiserver-birl-work-laptop --namespace kube-system --follow=true > /tmp/kube-apiserver.log & in the background, which netted me only 123 lines this time – I have another log from an earlier run that's just over 4,500 lines.
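kubectl logs obviously stops working once the apiserver is down, but the container logs are still on disk, so crictl can read them straight from containerd (crictl may need its runtime endpoint pointed at unix:///run/containerd/containerd.sock; the container ID comes from the second command):

  kubectl logs kube-apiserver-birl-work-laptop -n kube-system --previous  # prior crashed instance, while the apiserver still answers
  crictl ps -a | grep kube-apiserver                                      # lists exited containers even with the apiserver down
  crictl logs <container-id>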

Questions:

Q1: What is left over after kubeadm reset --force when you receive:

W1202 12:25:13.194314 1098157 reset.go:103] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://10.0.0.168:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 10.0.0.168:6443: connect: connection refused

I do rm -Rfv /etc/cni/net.d/*, but I wonder if something else lingers.
(I also run iptables --flush --verbose.)
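For reference, kubeadm reset itself warns that it does not clean CNI configuration, iptables/IPVS state, or kubeconfig files, so my full teardown is roughly this (the ipvsadm line only matters if kube-proxy ran in IPVS mode):

  kubeadm reset --force
  rm -Rfv /etc/cni/net.d/*
  iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
  ipvsadm --clear        # only if kube-proxy used IPVS
  rm -f $HOME/.kube/config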

Q2: Is it normal behavior to wait up to 40s (in my case) for kube-apiserver to accept connections?
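Related: the apiserver exposes verbose health endpoints that list each internal check individually, which might show where those 40 seconds go (on a stock kubeadm install these should be readable anonymously):

  curl -k 'https://10.0.0.168:6443/livez?verbose'
  curl -k 'https://10.0.0.168:6443/readyz?verbose'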


At least yours restarts! :sweat_smile:

How did you know to do that? That is, what led you down the path of swapping out CRIs?

Interesting. I'll have to take some time later to swap in CRI-O myself. But today ain't that day.
Thanks for the idea.