Cluster information:
Kubernetes version: 1.24
Cloud being used: bare metal
Installation method: kubeadm
Host OS: Linux (Ubuntu)
CNI and version: Calico 0.3.1 (I think)
CRI and version: CRI-O (not sure of version)
I have been trying to figure out why Prometheus was unable to scrape metrics from kube-controller-manager and kube-scheduler. Prometheus kept reporting both targets as down. After some research and a little luck, I was able to solve the problem by doing the following:
On the master node:
1. Edit /etc/kubernetes/manifests/kube-controller-manager.yaml.
2. Locate "spec.containers[0].command".
3. Change the address for "--bind-address" from "127.0.0.1" to "0.0.0.0".
4. Repeat the same process for /etc/kubernetes/manifests/kube-scheduler.yaml.
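For reference, this is roughly what the relevant fragment of the edited controller-manager manifest looks like after the change; everything other than the --bind-address flag is elided:

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (fragment)
# Only the relevant portion is shown; the remaining kubeadm-generated
# fields and flags are left untouched.
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --bind-address=0.0.0.0   # changed from 127.0.0.1
    # ...other flags unchanged...
```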
After saving the changes, the static pods restarted automatically and their metrics eventually became available in Prometheus (targets up and alerts no longer firing).
The Kubernetes documentation for kube-controller-manager and kube-scheduler indicates that the default for --bind-address is 0.0.0.0 for both components. I am not sure how my cluster ended up with 127.0.0.1, but that's neither here nor there.
What I did here is definitely not a viable long-term solution, nor does it follow the principles of GitOps. What methods are available to:
- Configure these components to be reachable from the start, i.e. when a cluster is initially set up?
- Re-configure an existing cluster so that the change in config can be captured and saved as IaC?
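For the first point, I am guessing something along the lines of the following kubeadm ClusterConfiguration (passed via kubeadm init --config) would do it, but I have not tested this; the bind-address entries are my assumption based on the manual edit above:

```yaml
# kubeadm-config.yaml -- untested sketch, based only on the flags I changed by hand
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.24.0
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"   # assumption: mirrors the manual edit above
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
```

If that is the recommended approach for new clusters, is there also a supported way to re-apply the same configuration to an existing cluster without hand-editing the static pod manifests?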