Cluster information:
Kubernetes version: 1.20.15 (I know it is EOL, but sometimes you just have these tech burdens, and the upgrade failed. I’m dreaming of 1.26+)
Cloud being used: bare-metal, 3 master nodes (with kube-apiserver, kube-controller-manager and kube-scheduler as systemd services) and 3 fairly powerful worker nodes
Installation method: no kubeadm, everything runs as a systemd service (Kubernetes the hard way)
Host OS: Ubuntu Server
Problem
So, first off, I recently upgraded our cluster from 1.16.x to 1.20.15, version by version. Since 1.19 or 1.20, I get errors like this from the kube-scheduler and the kube-controller-manager:
kube-scheduler[28140]: E0113 01:33:50.899838 28140 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: Get "http://127.0.0.1:8080/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0": dial tcp 127.0.0.1:8080: connect: connection refused
The reason is easy to see in the systemd service files. Below I only show the unit for the kube-scheduler, because the one for the kube-controller-manager is equivalent.
kube-scheduler.service:
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
User=k8s
Group=k8s
ExecStart=/usr/local/bin/kube-scheduler \
--leader-elect=true \
--master=http://127.0.0.1:8080 \
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
As one can see, the admin before me just used the insecure connection for the connection to the master. This shouldn’t work with 1.19 anymore and will be completely disabled with 1.24, right? So, I have to change that. Consequently, I changed --master to https://10.0.10.10:6443 (the --advertise-address of the kube-apiserver is 10.0.10.10, btw) and voilà - I have a new error:
kube-scheduler[17088]: E0113 01:45:42.134481 17088 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.0.10.5:6443/api/v1/services?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
That means I have to provide the kube-scheduler/kube-controller-manager with a certificate authority (CA) file for the connection to the kube-apiserver. However, I just can’t find a configuration flag for this on these services. I looked through the documentation, googled etc., but I can’t find it. That all brings me to the following questions:
- Is it correct to use the --advertise-address of the kube-apiserver as --master for the kube-scheduler, with port 6443 and HTTPS?
- Is there a way to provide a CA file for the connection to the kube-apiserver, in order to verify the certificate presented by the kube-apiserver?
- I already tried some approaches with a kubeconfig but couldn’t get them working. Is this the way to go?
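To make the last question concrete, here is a minimal sketch of the kind of kubeconfig I have in mind for the kube-scheduler. It is not my real file: every file path, the cluster name "kubernetes" and the context name "default" are placeholders I made up for illustration, and I am assuming a client certificate/key pair for the scheduler exists. Only the server address is taken from above.

```yaml
# Sketch of a kubeconfig for kube-scheduler (all paths are assumed placeholders)
apiVersion: v1
kind: Config
clusters:
- name: kubernetes                      # hypothetical local name for the cluster entry
  cluster:
    server: https://10.0.10.10:6443     # secure port of the kube-apiserver
    certificate-authority: /etc/kubernetes/pki/ca.pem   # CA that signed the apiserver certificate (assumed path)
users:
- name: system:kube-scheduler           # local label; the actual identity comes from the client certificate
  user:
    client-certificate: /etc/kubernetes/pki/kube-scheduler.pem   # assumed path
    client-key: /etc/kubernetes/pki/kube-scheduler-key.pem       # assumed path
contexts:
- name: default                         # hypothetical context name
  context:
    cluster: kubernetes
    user: system:kube-scheduler
current-context: default
```

If this is the right direction, I would expect to start the scheduler with --kubeconfig pointing at this file instead of --master, and to do the same for the kube-controller-manager with its own certificate.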
Thank you in advance!