Apiserver liveness and readiness probes fail randomly with code 500

Cluster information:

Kubernetes version:
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: Ubuntu 22.04
CNI and version: Calico 3.23.3
CRI and version: containerd.io 1.6.6

Hi,

I’m seeing quite a lot of warning events, such as:

kube-apiserver-k8cp1.170418e7b72f9344
kube-system
Unhealthy
Readiness probe failed: HTTP probe failed with statuscode: 500

or

kube-apiserver-k8cp3.17041b3d302fd052
kube-system
Unhealthy
Liveness probe failed: HTTP probe failed with statuscode: 500

They do not trigger pod restarts, but they make me uneasy about the health of my cluster.

Is there a way to properly debug them?

Here is my apiserver manifest (I already changed the readinessProbe periodSeconds from 1 to 5, without any benefit):

    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 10.0.50.31
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 10.0.50.31
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 5
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 10.0.50.31
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15

Also, curl against the probe endpoint returns the expected result, even when repeated:

ubuntu@k8cp1:~$ curl -k https://10.0.50.31:6443/readyz
ok
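
Since a single curl tends to return ok while the probes fail only now and then, one way to dig further is to poll the endpoint and, when a failure shows up, ask for the verbose output. The /readyz and /livez endpoints accept a `verbose` query parameter that lists each individual check as a `[+]` (passing) or `[-]` (failing) line. The sketch below is only an illustration: the function names `failed_checks` and `poll_probe` and their arguments are made up for this example, and the failure reason is usually reported as withheld in the HTTP response, so the apiserver log is still the place to look for details.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; names and arguments are assumptions.

# Keep only the failing checks ("[-]" lines) from verbose probe output on stdin.
failed_checks() {
    grep '^\[-\]' || true   # no failing checks is not treated as an error
}

# Poll a probe URL COUNT times, INTERVAL seconds apart, and print a
# timestamped line for every response that is not HTTP 200.
# The 15 s timeout mirrors timeoutSeconds from the manifest above.
poll_probe() {
    url=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        code=$(curl -sk -o /dev/null -m 15 -w '%{http_code}' "$url")
        [ "$code" = "200" ] || echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $url returned $code"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example usage (address taken from the manifest above):
#   poll_probe 'https://10.0.50.31:6443/readyz' 720 5
#   curl -sk 'https://10.0.50.31:6443/readyz?verbose' | failed_checks
```

If `poll_probe` never logs a failure while the events keep appearing, that would point more towards the kubelet or the container runtime on the node than towards the apiserver itself.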

Anything else to try or set?

Did you fix this? I have the same problem. I believe the probe is trying to connect via HTTPS and would work over an insecure connection, but I’m not sure where to set that.

Not yet. I don’t know whether it was a network problem, a protocol problem, or something else.

Hi.

I had the same issue and fixed it with these commands:

mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd

Hope this helps you too.
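
For context, the sed command above switches containerd's runc runtime to the systemd cgroup driver. Recent kubeadm versions configure the kubelet with the systemd cgroup driver by default, and a mismatch between the kubelet and the runtime is a plausible cause of unstable control-plane containers. Note also that the `containerd config default | tee` step overwrites any existing /etc/containerd/config.toml. Before or after applying the fix, the current setting can be checked with something like the sketch below. The function name `systemd_cgroup_value` is invented for this example, and the kubelet config path in the usage comment assumes a standard kubeadm install.

```shell
#!/bin/sh
# Hypothetical helper for illustration; the name is an assumption.

# Print the SystemdCgroup value(s) found in a containerd config file:
# "true", "false", or nothing at all if the key is absent.
systemd_cgroup_value() {
    sed -n 's/^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\([a-z]*\).*/\1/p' "$1"
}

# Example usage:
#   systemd_cgroup_value /etc/containerd/config.toml
#   grep cgroupDriver /var/lib/kubelet/config.yaml   # kubelet side, kubeadm default path
```

If the first command prints false (or nothing) while the kubelet reports cgroupDriver: systemd, the two are out of step and the fix above is worth trying.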