We’ve been seeing these types of events in our Kubernetes cluster for some time now:
NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE
default 25m Normal RegisteredNode node/xx0910 Node xx0910 event: Registered Node xx0910 in Controller
default 25m Normal RegisteredNode node/xx0911 Node xx0911 event: Registered Node xx0911 in Controller
default 25m Normal RegisteredNode node/xx0912 Node xx0912 event: Registered Node xx0912 in Controller
default 25m Normal RegisteredNode node/xx0913 Node xx0913 event: Registered Node xx0913 in Controller
kube-state-metrics 32s Warning Unhealthy pod/kube-state-metrics-865c9cbfd8-qj4zw Liveness probe failed: HTTP probe failed with statuscode: 503
kube-system 33s Warning Unhealthy pod/calico-kube-controllers-68485cbf9c-q9g2p Liveness probe failed: Error verifying datastore: Get "https://xx.xx.xx.xx.:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded; Error reaching apiserver: Get "https://xx.xx.xx.xx.:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded with http status code: 500
kube-system 33s Warning Unhealthy pod/calico-kube-controllers-68485cbf9c-q9g2p Readiness probe failed: Error verifying datastore: Get "https://xx.xx.xx.xx.:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded; Error reaching apiserver: Get "https://xx.xx.xx.xx.:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded with http status code: 500
kube-system 32s Warning Unhealthy pod/kube-apiserver-xx0910 Readiness probe failed: HTTP probe failed with statuscode: 500
kube-system 34s Warning Unhealthy pod/kube-apiserver-xx0910 Liveness probe failed: HTTP probe failed with statuscode: 500
kube-system 25m Normal Pulled pod/kube-controller-manager-xx0910 Container image "registry.k8s.io/kube-controller-manager:v1.29.5" already present on machine
kube-system 25m Normal Created pod/kube-controller-manager-xx0910 Created container kube-controller-manager
kube-system 25m Normal Started pod/kube-controller-manager-xx0910 Started container kube-controller-manager
kube-system 25m Normal LeaderElection lease/kube-controller-manager xx0910a22e9d48-b7bc-4277-a31d-af87699878ff became leader
The cluster works well (or so we think), but we’re concerned about these events.
We can split these events into two possible errors:
- Nodes are being registered multiple times
- Calico fails to access the datastore (ETCD)
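To make the two groups easier to tell apart, the Warning events can be counted per object and reason. The following is only a minimal sketch: it assumes the events were exported with `kubectl get events -A -o json` and parsed into a dict, and the function name `summarize_warnings` is made up for illustration.

```python
from collections import Counter


def summarize_warnings(event_list):
    """Count Warning events per (namespace, kind/name, reason).

    `event_list` is assumed to be the parsed JSON from
    `kubectl get events -A -o json` (a List object with an "items" array).
    """
    counts = Counter()
    for ev in event_list.get("items", []):
        # Skip Normal events such as RegisteredNode or Pulled.
        if ev.get("type") != "Warning":
            continue
        obj = ev.get("involvedObject", {})
        key = (
            ev.get("metadata", {}).get("namespace", ""),
            f'{obj.get("kind", "").lower()}/{obj.get("name", "")}',
            ev.get("reason", ""),
        )
        # Repeated events may carry a "count" field; treat a missing one as 1.
        counts[key] += ev.get("count") or 1
    return counts
```

Feeding it `json.load(sys.stdin)` from the piped `kubectl` output and printing `summarize_warnings(...).most_common()` would list the noisiest objects first.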
We’ve verified that Calico is working properly, and all our tests against ETCD pass.
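Since the kube-apiserver probes themselves return 500, one further check that might narrow this down is which individual readiness check is failing. The sketch below assumes the text comes from `kubectl get --raw='/readyz?verbose'`, where each check is printed as a `[+]name ok` or `[-]name failed: ...` line. The helper name `failed_checks` is hypothetical.

```python
def failed_checks(readyz_output):
    """Return the names of checks marked as failed in verbose readyz output."""
    failed = []
    for line in readyz_output.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # A failed line looks like "[-]etcd failed: reason withheld";
            # the check name is everything up to the first space.
            failed.append(line[3:].split(" ", 1)[0])
    return failed
```

An empty list means every listed check passed at that moment, so the output would have to be captured while the probe failures are occurring.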
We deployed the cluster with Kubespray, and our versions are:
- Operating System: Red Hat Enterprise Linux release 8.10 (Ootpa)
- Kubernetes: v1.29.5
- ETCD: 3.5.12
- Calico: v3.27.3
- Server is a VM on VMware
- Disks are over storage
Can anyone help me?