`kubeadm join` fails randomly

Cluster information:

Kubernetes version: 1.20.5
Cloud being used: AWS
Installation method: "Creating Highly Available clusters with kubeadm" from the documentation. The whole infrastructure is defined in a Terraform project, so I can reproduce the issue reliably.
Host OS: Ubuntu 20.04
CNI and version: calico (0.3.1)
CRI and version: containerd (1.3.3-0ubuntu2.3)

Hey, I have an issue where joining a node to the control plane succeeds, but the kube-apiserver on that node then enters a crash loop. What I find weird is that the issue appears to be completely random: if I just reset the node with kubeadm reset -f and run kubeadm join again, there is about a 50% chance that it will succeed. In the logs below, 10-17-83-85 is the broken node and 10-17-80-97 is healthy.
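For reference, the reset-and-retry cycle I run on the affected node is roughly this (the endpoint, token, hash, and certificate key below are placeholders, not my real values):

kubeadm reset -f
kubeadm join LOAD_BALANCER_DNS:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --control-plane --certificate-key <certificate-key>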

LOGS
kubectl describe pod -n kube-system kube-apiserver-ip-10-17-83-85
kubectl get --raw='https://10.17.83.85:6443/livez?verbose' - from 10-17-83-85
curl -k 'https://10.17.83.85:6443/livez?verbose' - from both nodes → connection refused (see the port check after this list)
journalctl -xeu kubelet - from 10-17-83-85
kubectl get pods -n kube-system
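Since curl is refused from both nodes, this is roughly how I'm checking whether anything is listening on 6443 at all on the broken node, and pulling the apiserver container logs straight from containerd (the --name filter is just how the container shows up on my setup):

ss -tlnp | grep 6443
crictl ps -a --name kube-apiserver
crictl logs $(crictl ps -a --name kube-apiserver -q | head -n 1)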

I’m out of ideas on how to debug this. Anyone think they can help? Thanks

There's an open issue about this.

What makes you think it is the same issue? I’ve read the whole thread and my issue seems completely different.

I have a similar problem.
Fresh install of Ubuntu 22.04, kubeadm v1.24.2, using containerd (not dockershim).

After the kubelet is restarted, the apiserver comes up and works for anywhere from a few seconds to a few minutes, then fails at a random point.
The smoking gun is this line:
root@k8s04:/var/log/pods# grep -ri "Shutting down kubernetes service endpoint reconciler" *
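To see what the apiserver logs right before that message, grepping with some leading context over the default /var/log/pods layout works (the directory glob matches how the static pod logs are named on my node):

grep -ri -B 20 "Shutting down kubernetes service endpoint reconciler" /var/log/pods/kube-system_kube-apiserver-*/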

I tried adding --endpoint-reconciler-type=none and --endpoint-reconciler-type=lease to /etc/kubernetes/manifests/kube-apiserver.yaml, with no visible difference.
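In case I botched the manifest edit, this is roughly how I checked that the flag actually made it into the restarted apiserver container (the pod name is based on my hostname k8s04; adjust for yours):

crictl inspect $(crictl ps --name kube-apiserver -q | head -n 1) | grep endpoint-reconciler-type
kubectl -n kube-system get pod kube-apiserver-k8s04 -o yaml | grep endpoint-reconciler-type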