Major problems with Kubernetes cluster

Hello. I'm having a major problem running a multi-node Kubernetes cluster on Docker 20.10. The cluster runs on 4 vSphere Red Hat 8 hosts with the following base setup: Calico/Flannel (tried both), MetalLB, NGINX Ingress, and the cgroupfs Docker cgroup driver (I also tried systemd, as recommended, with the same problem).
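For reference, the systemd attempt was the standard daemon.json change recommended for kubeadm-style setups, applied on each node roughly like this (sketch only; the exact file may have contained other keys):

```
# /etc/docker/daemon.json - switch Docker to the systemd cgroup driver
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl restart docker
docker info | grep -i cgroup   # should report: Cgroup Driver: systemd
# kubelet's cgroupDriver was set to systemd as well so the two match
```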

Everything goes fine and the cluster runs solidly (most recently with a Kubeapps installation) until, at a seemingly random point (sometimes while running a port-forward to access a service in the browser), the command freezes. When I try to log in to the server again, the login prompt no longer appears. Kubernetes keeps running in the background, but I can no longer reach the server over SSH.
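To make that concrete, the freeze tends to hit during a command along these lines (the Kubeapps namespace/service names here are just illustrative of what was being forwarded):

```
# Illustrative port-forward that ends up hanging (names may differ in my setup)
kubectl port-forward -n kubeapps svc/kubeapps 8080:80
# browse to http://127.0.0.1:8080 - at some point the terminal freezes,
# and new SSH sessions to that node stop getting a login prompt
```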

Usually this happens on the master node, but the second node has had a similar problem. Running traceroute, traffic destined for the first node appears to be routed through the fourth node (though that may be unrelated).

Are there any known issues with vSphere virtual machines, Calico/Flannel, MetalLB, or NGINX Ingress?

According to a colleague, the network service appears to be causing the problem: during reboot the service simply stops working. Has anybody heard of this problem, and is there a solution?
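In case it helps, this is roughly how the network service was checked on the affected node after a reboot (Red Hat 8 defaults to NetworkManager, so that is what is assumed here):

```
# State of the network stack on the affected RHEL 8 node (assumes NetworkManager)
systemctl status NetworkManager
journalctl -b -u NetworkManager --no-pager | tail -n 50
# interfaces, including the CNI ones (cali*/flannel.1), to see what is still up
ip -br addr
```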

Cluster information:

Kubernetes version: 1.17
Cloud being used: bare metal (vSphere VMs)
Installation method: manual
Host OS: Red Hat 8