Kubernetes becomes unavailable; connection to server was refused

Hello. I recently tried installing Kubernetes on three virtual machines, all running Ubuntu 22.04. I mostly followed the steps in this tutorial, except that I used Calico for pod networking. (I did not install calicoctl.)
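
For context, the install was the standard kubeadm flow followed by the Calico manifest, roughly along these lines (the exact flags, pod CIDR, and Calico version came from the tutorial, so treat the values below as placeholders):

# On the master (control-plane) node; real flags and pod CIDR per the tutorial
$ sudo kubeadm init --control-plane-endpoint "k8smaster.myserver.dev:6443" --pod-network-cidr "10.244.0.0/16"

# Pod networking via the Calico manifest (version number here is a placeholder)
$ kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml

# On each worker, the join command that kubeadm init printed
$ sudo kubeadm join k8smaster.myserver.dev:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>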

This actually worked (sort of), but when I signed out and came back in the morning I could no longer query the master:

$ kubectl get nodes
E0215 16:03:14.732092 1753 memcache.go:238] couldn't get current server API group list: Get "https://k8smaster.myserver.dev:6443/api?timeout=32s": dial tcp 192.168.1.9:6443: connect: connection refused
The connection to the server k8smaster.myserver.dev:6443 was refused - did you specify the right host or port?
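
Since kubectl can't even open a TCP connection to 192.168.1.9:6443, I assume the API server itself is no longer running on the master. I haven't dug into the container runtime yet; presumably the next check would be something like this (assuming containerd as the runtime):

# Is the kube-apiserver container even running? List all containers, including exited ones
$ sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep -E 'kube-apiserver|etcd'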

I think the ~/.kube directory is set up correctly.

$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://k8smaster.myserver.dev:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: DATA+OMITTED
    client-key-data: DATA+OMITTED
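
As far as I can tell this matches the usual post-install copy of the admin kubeconfig, i.e. the commands kubeadm prints at the end of kubeadm init:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config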

Other signals:

$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Wed 2023-02-15 13:59:30 UTC; 2h 11min ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 954 (kubelet)
      Tasks: 11 (limit: 4460)
     Memory: 42.3M
        CPU: 1min 14.799s
     CGroup: /system.slice/kubelet.service
             └─954 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubel>

Feb 15 16:11:17 k8smaster.myserver.dev kubelet[954]: E0215 16:11:17.699246 954 kuberuntime_sandbox.go:45] "Failed to generate sand>
Feb 15 16:11:17 k8smaster.myserver.dev kubelet[954]: E0215 16:11:17.699277 954 kuberuntime_manager.go:782] "CreatePodSandbox for p>
Feb 15 16:11:17 k8smaster.myserver.dev kubelet[954]: E0215 16:11:17.699393 954 pod_workers.go:965] "Error syncing pod, skipping" e>
Feb 15 16:11:18 k8smaster.myserver.dev kubelet[954]: E0215 16:11:18.844142 954 eviction_manager.go:261] "Eviction manager: failed >
Feb 15 16:11:22 k8smaster.myserver.dev kubelet[954]: E0215 16:11:22.573311 954 controller.go:146] failed to ensure lease exists, w>
Feb 15 16:11:23 k8smaster.myserver.dev kubelet[954]: E0215 16:11:23.698518 954 kuberuntime_sandbox.go:45] "Failed to generate sand>
Feb 15 16:11:23 k8smaster.myserver.dev kubelet[954]: E0215 16:11:23.698742 954 kuberuntime_manager.go:782] "CreatePodSandbox for p>
Feb 15 16:11:23 k8smaster.myserver.dev kubelet[954]: E0215 16:11:23.698851 954 pod_workers.go:965] "Error syncing pod, skipping" e>
Feb 15 16:11:24 k8smaster.myserver.dev kubelet[954]: I0215 16:11:24.246675 954 kubelet_node_status.go:70] "Attempting to register >
Feb 15 16:11:24 k8smaster.myserver.dev kubelet[954]: E0215 16:11:24.247335 954 kubelet_node_status.go:92] "Unable to register node>
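
The log lines above are truncated by the status output's column width; if it helps, the full messages can be pulled straight from the journal:

$ sudo journalctl -u kubelet --no-pager --since "1 hour ago" | tail -n 50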

$ netstat -a | grep 6443
$ sudo netstat -lnpt|grep kube
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 954/kubelet
tcp6 0 0 :::10250 :::* LISTEN 954/kubelet
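
So nothing is listening on 6443 at all. Since kubeadm runs the API server as a static pod managed by the kubelet, I guess the next thing to check is whether the static pod manifests are still in the standard location:

# kubeadm runs the control plane from static pod manifests in this directory
$ ls -l /etc/kubernetes/manifests/
# expected: etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml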

K8s info:

$ kubectl version -o json
{
  "clientVersion": {
    "major": "1",
    "minor": "26",
    "gitVersion": "v1.26.1",
    "gitCommit": "8f94681cd294aa8cfd3407b8191f6c70214973a4",
    "gitTreeState": "clean",
    "buildDate": "2023-01-18T15:58:16Z",
    "goVersion": "go1.19.5",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "kustomizeVersion": "v4.5.7"
}

Ubuntu:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.1 LTS
Release:	22.04
Codename:	jammy

I say this worked originally, but not really: I was only ever able to run kubectl as the root user on the master, and I could not run kubectl commands on any of the workers.
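
My understanding is that kubeadm join doesn't create a kubeconfig on the workers, so kubectl there has nothing to talk to unless you copy one over yourself, e.g. something like this (the paths and ssh user are just an example):

# Example only -- run on a worker; assumes root ssh access to the master
$ mkdir -p ~/.kube
$ scp root@k8smaster.myserver.dev:/etc/kubernetes/admin.conf ~/.kube/config
$ kubectl get nodes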

Still, something was working and now it isn't. I suspect that if I restore my VM to a previous snapshot it will work again, but I'm curious to understand what went wrong here. Any ideas?

Normally 192.168.x.x IPs are used in home networks with DHCP.

Are the IP addresses still the same?
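
That's easy to check by comparing what the hostname resolves to with the addresses the master currently has, e.g. on the master:

$ getent hosts k8smaster.myserver.dev
$ ip -4 addr show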