Calico pods Running but 0/1 Ready: probe exec errors

Cluster information:

Kubernetes version: 1.29.3
Cloud being used: (put bare-metal if not on a public cloud)
Installation method: kubeadm
Host OS: Ubuntu 22.04 LTS
CNI and version: Calico 3.27.2
CRI and version: CRI-O 1.24.6
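
(Side note: CRI-O minor releases are meant to track Kubernetes minor releases, so CRI-O 1.24 alongside Kubernetes 1.29 is a fairly large skew; I'm not sure whether that matters here. The versions above can be cross-checked with:)

kubectl version
kubectl get nodes -o wide    # the CONTAINER-RUNTIME column shows the runtime version per node
crio version                 # run directly on the node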

Hello,

I’m currently encountering an issue with Calico in my Kubernetes cluster. The pods are in the Running state, but they never become Ready: the kubelet reports that the containers' liveness and readiness probes keep erroring.

Here’s the information I’ve gathered so far:

  1. kubectl describe pod -n calico-system calico-kube-controllers-64cd7b9575-824cw

Name: calico-kube-controllers-64cd7b9575-824cw
Namespace: calico-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: calico-kube-controllers
Node: master/10.40.162.88
Start Time: Thu, 21 Mar 2024 03:33:05 +0000
Labels: app.kubernetes.io/name=calico-kube-controllers
k8s-app=calico-kube-controllers
pod-template-hash=64cd7b9575
Annotations: cni.projectcalico.org/containerID: 308ac1238adfa4aa37d735714ac7d6bc2e28629de4e0236cdd1e4448fd417cd6
cni.projectcalico.org/podIP: 10.244.219.65/32
cni.projectcalico.org/podIPs: 10.244.219.65/32
hash.operator.tigera.io/system: fdde45054a8ae4f629960ce37570929502e59449
tigera-operator.hash.operator.tigera.io/tigera-ca-private: 458bc1197c5be28d972bea15741e22d9e4c7cb10
Status: Running
IP: 10.244.219.65
IPs:
IP: 10.244.219.65
Controlled By: ReplicaSet/calico-kube-controllers-64cd7b9575
Containers:
calico-kube-controllers:
Container ID: cri-o://a2db70cbbaeec42ec8a7145b84d967d05d8f47028e084f12720aabe34597ca06
Image: docker.io/calico/kube-controllers:v3.27.2
Image ID: docker.io/calico/kube-controllers@sha256:d8a7bd92a38119c69ffce5152ffbdd59393be836bb66d572767ba1c3d21d97fa
Port: <none>
Host Port: <none>
SeccompProfile: RuntimeDefault
State: Running
Started: Thu, 21 Mar 2024 03:33:05 +0000
Ready: False
Restart Count: 0
Liveness: exec [/usr/bin/check-status -l] delay=10s timeout=10s period=60s #success=1 #failure=6
Readiness: exec [/usr/bin/check-status -r] delay=0s timeout=10s period=30s #success=1 #failure=3
Environment:
KUBE_CONTROLLERS_CONFIG_NAME: default
DATASTORE_TYPE: kubernetes
ENABLED_CONTROLLERS: node
FIPS_MODE_ENABLED: false
KUBERNETES_SERVICE_HOST: 10.96.0.1
KUBERNETES_SERVICE_PORT: 443
CA_CRT_PATH: /etc/pki/tls/certs/tigera-ca-bundle.crt
Mounts:
/etc/pki/tls/cert.pem from tigera-ca-bundle (ro,path="ca-bundle.crt")
/etc/pki/tls/certs from tigera-ca-bundle (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xghj4 (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tigera-ca-bundle:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: tigera-ca-bundle
Optional: false
kube-api-access-xghj4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 11m (x154 over 66m) kubelet Readiness probe errored: rpc error: code = Unknown desc = command error: EOF, stdout: , stderr: , exit code -1
Warning Unhealthy 63s (x65 over 65m) kubelet Liveness probe errored: rpc error: code = Unknown desc = command error: EOF, stdout: , stderr: , exit code -1
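
(The corresponding errors should also appear in the kubelet journal; assuming the systemd-managed kubelet that kubeadm sets up, they can be pulled with:)

journalctl -u kubelet | grep -i probe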

  2. kubectl exec -n calico-system -it calico-kube-controllers-64cd7b9575-824cw -- /usr/bin/check-status -r
    Ready

  3. kubectl exec -n calico-system -it calico-kube-controllers-64cd7b9575-824cw -- /usr/bin/check-status -l
    Ready

  4. The crictl ps command shows that both containers (calico-node and calico-kube-controllers) are running; the exact command is shown below.
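
(Assuming crictl is pointed at the CRI-O socket, e.g. via /etc/crictl.yaml:)

sudo crictl ps --name calico    # --name filters container names by regex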

  5. calicoctl node status

Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 10.40.162.89 | node-to-node mesh | up    | 03:33:10 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

Since running the same check-status commands manually with kubectl exec succeeds while the kubelet's exec probes fail with an rpc error, the problem seems to lie in how the probes are executed rather than in the containers themselves. (As far as I understand, kubectl exec goes through the CRI's streaming Exec call, while exec probes use the synchronous ExecSync call, so the two paths can fail independently.)
I would appreciate any insights or suggestions on how to troubleshoot and resolve this issue.
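
In case it helps, I believe the probe path can be exercised directly against CRI-O with crictl, using the container ID from the describe output above (the --sync flag, where the installed crictl supports it, should use the same ExecSync call as the probes):

sudo crictl exec --sync a2db70cbbaee /usr/bin/check-status -r
sudo crictl exec --sync a2db70cbbaee /usr/bin/check-status -l
journalctl -u crio --since "1 hour ago"    # CRI-O's own logs for the same window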

Thank you.