`kubeadm join` fails randomly

Cluster information:

Kubernetes version: 1.20.5
Cloud being used: AWS
Installation method: "Creating Highly Available clusters with kubeadm" from the documentation. The whole infrastructure is defined in a Terraform project, so I can reproduce the issue reliably.
Host OS: Ubuntu 20.04
CNI and version: calico (0.3.1)
CRI and version: containerd (1.3.3-0ubuntu2.3)

Hey, I have an issue where joining a node to the control plane succeeds, but the kube-apiserver on that node then enters a crash loop. What I find weird is that the issue appears to be completely random: if I just reset the node with kubeadm reset -f and run kubeadm join again, there is about a 50% chance that it will succeed. In the logs below, 10-17-83-85 is the broken node and 10-17-80-97 is healthy.
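For reference, the reset-and-retry cycle I run on the affected node is roughly this (the endpoint, token, hash, and certificate key below are placeholders, not my real values):

kubeadm reset -f
kubeadm join LOAD_BALANCER_DNS:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --control-plane --certificate-key <certificate-key>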

LOGS
kubectl describe pod -n kube-system kube-apiserver-ip-10-17-83-85
kubectl get --raw='https://10.17.83.85:6443/livez?verbose' - from 10-17-83-85
curl -k 'https://10.17.83.85:6443/livez?verbose' - from both nodes → connection refused (see the port check after this list)
journalctl -xeu kubelet - from 10-17-83-85
kubectl get pods -n kube-system
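Since curl is refused from both nodes, this is roughly how I'm checking whether anything is listening on 6443 at all on the broken node, and pulling the apiserver container logs straight from containerd (the --name filter is just how the container shows up on my setup):

ss -tlnp | grep 6443
crictl ps -a --name kube-apiserver
crictl logs $(crictl ps -a --name kube-apiserver -q | head -n 1)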

I’m out of ideas on how to debug this. Anyone think they can help? Thanks

There's an open issue about this.

What makes you think it is the same issue? I’ve read the whole thread and my issue seems completely different.

I have a similar problem.
Fresh install of Ubuntu 22.04, kubeadm v1.24.2, using containerd (not dockershim).

After the kubelet is restarted, the apiserver comes up and works for anywhere from a few seconds to a few minutes, then fails at a random point.
The smoking gun is this line:
root@k8s04:/var/log/pods# grep -ri "Shutting down kubernetes service endpoint reconciler" *
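To see what the apiserver logs right before that message, grepping with some leading context over the default /var/log/pods layout works (the directory glob matches how the static pod logs are named on my node):

grep -ri -B 20 "Shutting down kubernetes service endpoint reconciler" /var/log/pods/kube-system_kube-apiserver-*/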

I tried adding --endpoint-reconciler-type=none and --endpoint-reconciler-type=lease to /etc/kubernetes/manifests/kube-apiserver.yaml, with no visible difference.
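In case I botched the manifest edit, this is roughly how I checked that the flag actually made it into the restarted apiserver container (the pod name is based on my hostname k8s04; adjust for yours):

crictl inspect $(crictl ps --name kube-apiserver -q | head -n 1) | grep endpoint-reconciler-type
kubectl -n kube-system get pod kube-apiserver-k8s04 -o yaml | grep endpoint-reconciler-type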