Kubernetes API service not responding

Hello,

I’m a newbie and was trying to set up a basic Kubernetes environment with one master node and one worker node.

I’m facing a problem where my master node’s API service abruptly stops responding. If I restart the ‘kubelet’ service it works for a couple of minutes, then it stops again and gives the following error.

Please help me fix this problem; I’ve been stuck on this issue for the last three days and I’m not able to figure out the root cause.

Thank you.

ubadmin@kubernetes-master:~$ kubectl get pods --all-namespaces
E0427 06:08:23.654226 13405 memcache.go:265] couldn't get current server API group list: Get "https://X.X.X.X:6443/api?timeout=32s": dial tcp X.X.X.X:6443: connect: connection refused
E0427 06:08:23.655158 13405 memcache.go:265] couldn't get current server API group list: Get "https://X.X.X.X:6443/api?timeout=32s": dial tcp X.X.X.X:6443: connect: connection refused
E0427 06:08:23.655813 13405 memcache.go:265] couldn't get current server API group list: Get "https://X.X.X.X:6443/api?timeout=32s": dial tcp X.X.X.X:6443: connect: connection refused
E0427 06:08:23.657208 13405 memcache.go:265] couldn't get current server API group list: Get "https://X.X.X.X:6443/api?timeout=32s": dial tcp X.X.X.X:6443: connect: connection refused
E0427 06:08:23.685930 13405 memcache.go:265] couldn't get current server API group list: Get "https://X.X.X.X:6443/api?timeout=32s": dial tcp X.X.X.X:6443: connect: connection refused
The connection to the server X.X.X.X:6443 was refused - did you specify the right host or port?
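For what it’s worth, “connection refused” means nothing is listening on port 6443 at all (the kube-apiserver itself is down), rather than a firewall or routing issue. A quick, generic way to probe the port from the master itself — just a sketch, assuming bash with /dev/tcp support:

```shell
# probe_port HOST PORT: prints "open" if a TCP connect succeeds,
# "closed" on refusal or timeout (uses bash's /dev/tcp pseudo-device).
probe_port() {
    if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

probe_port 127.0.0.1 6443
```

On a healthy control plane this prints “open”; while you are seeing the errors above it will print “closed”, which points at the apiserver static pod rather than the network.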

Hi, well I had a similar problem.
There seems to be a problem with Kubernetes 1.24+ and certain Container Runtime Interfaces (CRIs), namely Docker: 1.24 removed the built-in dockershim, so Docker no longer works as a runtime out of the box. The pod that hosts the kube-apiserver randomly loses its connection to the other kube-system pods, and that produces these errors.
It’s pretty complicated to understand why this happens, but some GitHub users successfully solved it by ditching the Docker CRI and switching to Red Hat’s CRI-O.
And this did the trick … at least for me :wink:
You may like this Link

So I’d suggest either (completely) removing Docker from your machine(s) or (even better) starting with a fresh install of your box.

Go to https://cri-o.io/ to get the latest CRI-O runtime FOR YOUR OS/KERNEL!
You’ll need to make sure the version fits your Kubernetes version, otherwise your system will become unstable.
You’ll probably also want to upgrade kubelet, kubeadm and kubectl.

If not already present, you may need to add the new CRI configuration:

mkdir -p /var/lib/kubelet
echo 'KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/crio/crio.sock"' > /var/lib/kubelet/kubeadm-flags.env

Finally, enable the CRI-O Service:

systemctl daemon-reload
systemctl enable crio
systemctl start crio
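Once CRI-O is up, it’s worth confirming the socket actually answers before pointing the kubelet at it. A quick check — this sketch assumes the default socket path and that crictl is installed:

```shell
# Verify the CRI-O socket exists and responds to CRI calls.
SOCK=/run/crio/crio.sock
if [ -S "$SOCK" ] && command -v crictl >/dev/null 2>&1; then
    crictl --runtime-endpoint "unix://$SOCK" version
else
    echo "CRI-O socket not found at $SOCK (is the crio service running?)"
fi
```

If this prints version info, a `systemctl restart kubelet` should make the kubelet pick up the new runtime flags.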

Hi Martin,

Thanks for replying!

I tried the steps and installed Red Hat’s CRI-O, but the API server still crashes abruptly.

I also noticed that pods randomly go into ‘CrashLoopBackOff’ status, and right after that the API service crashes; once I restart the ‘kubelet’ service, things go back to normal for a couple of minutes. FYI, the K8s master node has 4 vCPUs and 8 GB of RAM.

Please help me with this.
Thank you.

puh … that’s gonna be a tough one :sweat_smile:

Well, the kubelet service is responsible for tearing pods up/down on the local machine. And if it fails - for whatever reason - the whole node becomes unstable. That’s exactly what’s happening to you.
So my first guess would be investigating into the kubelet logs to see if there’s something “suspicious”:
journalctl -u kubelet | tail -200
Also check the syslog:
cat /var/log/syslog | tail -200
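Since those logs get long, a rough grep over the last few hundred lines often surfaces the culprit faster. The pattern list below is just my illustrative guess at common failure signatures (OOM kills, evictions, restart back-offs, PLEG timeouts), not an official list:

```shell
# Print only log lines matching common kubelet failure signatures.
filter_kubelet_errors() {
    grep -Ei 'oom|evict|back-off|crashloop|pleg|fail'
}

# Apply it to the kubelet journal (guarded so it's a no-op without systemd).
command -v journalctl >/dev/null 2>&1 \
    && journalctl -u kubelet --no-pager | tail -200 | filter_kubelet_errors \
    || true
```

The same filter works on /var/log/syslog if you pipe it through.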

I reckon your CPU/RAM config is OK. Of course it depends on what you’re planning to do, but for this curriculum (LFS258) it’s absolutely sufficient. However, if you ran into memory or other system bottlenecks, you would also see them in /var/log/syslog.

Hello Martin,

Finally I was able to configure my lab by following this tutorial: https://www.linuxtechi.com/install-kubernetes-on-ubuntu-22-04/

The only difference in this tutorial is initializing the Kubernetes cluster with ‘kubeadm init --control-plane-endpoint=k8smaster’ rather than using ‘--pod-network-cidr=192.168.0.0/16’. It also uses the Calico Pod network add-on instead of Flannel.
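For anyone landing here later: those two kubeadm flags aren’t mutually exclusive, so you can pass both. Calico’s default manifests assume the 192.168.0.0/16 pod CIDR, which is why that value shows up in both tutorials. For reference only — to be run on a fresh control-plane node:

```shell
# Reference only: initialize a control plane with both flags combined.
sudo kubeadm init \
  --control-plane-endpoint=k8smaster \
  --pod-network-cidr=192.168.0.0/16
```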

Thanks for replying to my query and helping me initially! :slight_smile:


Thanks, this tutorial helped me as well.