Cluster information:
Kubernetes version: 1.20.5
Cloud being used: AWS
Installation method: "Creating Highly Available clusters with kubeadm" from the documentation. The whole infrastructure is in a Terraform project, so I can reliably reproduce the issue.
Host OS: Ubuntu 20.04
CNI and version: calico (0.3.1)
CRI and version: containerd (1.3.3-0ubuntu2.3)
Hey, I have an issue where joining a node to the control plane succeeds, but the kube-apiserver then enters a crash loop. What I find weird is that the issue appears to be completely random: if I just reset the node using kubeadm reset -f and try kubeadm join again, there is about a 50% chance that it will succeed. For the logs below, 10-17-83-85 is the broken node and 10-17-80-97 is healthy.
LOGS
kubectl describe pod -n kube-system kube-apiserver-ip-10-17-83-85
kubectl get --raw='https://10.17.83.85:6443/livez?verbose' - from 10-17-83-85
curl -k 'https://10.17.83.85:6443/livez?verbose' - from both nodes → connection refused???
journalctl -xeu kubelet - from 10-17-83-85
kubectl get pods -n kube-system
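Since the failure is intermittent, it may help to run the livez check above in a loop rather than once. The following is a minimal sketch, not a tested diagnostic tool: the function names (livez_url, classify_curl_exit, poll_livez) and the retry defaults are assumptions made up for illustration. It relies only on curl's documented exit codes (7 for a failed connection, 28 for a timeout) to separate "nothing is listening on 6443" from "the apiserver answered".

```shell
#!/bin/sh
# Sketch only: function names and retry defaults are illustrative assumptions.

# Build the verbose livez URL for a node IP (port 6443, as in the commands above).
livez_url() {
    printf 'https://%s:6443/livez?verbose' "$1"
}

# Map a curl exit code to a short label.
# 0 = an HTTP response was received (even a failing one, since -f is not used),
# 7 = could not connect (connection refused is one cause), 28 = timed out.
classify_curl_exit() {
    case "$1" in
        0)  echo "responded" ;;
        7)  echo "connect-failed" ;;
        28) echo "timeout" ;;
        *)  echo "other-error-$1" ;;
    esac
}

# Poll the endpoint several times and print one status line per attempt.
# Arguments: node IP, number of attempts (default 5), delay in seconds (default 2).
poll_livez() {
    ip="$1"
    attempts="${2:-5}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        curl -k -s -o /dev/null --max-time 5 "$(livez_url "$ip")"
        rc=$?
        echo "attempt $i: $(classify_curl_exit "$rc")"
        i=$((i + 1))
        sleep "$delay"
    done
}
```

For example, running poll_livez 10.17.83.85 10 3 on each node while the apiserver pod restarts would show whether the port ever opens between crashes.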
I’m out of ideas on how to debug this. Anyone think they can help? Thanks