Hi,
I’m trying to deploy a Kubernetes control plane on a c5.xlarge instance in AWS with Ubuntu 20.04, and I’m failing miserably at it.
The cluster comes up and works for a while, but then the API server dies.
I also see containers in the kube-system namespace going into CrashLoopBackOff:
ubuntu@ip-10-10-100-179:~$ kubectl get pods -A
NAMESPACE     NAME                                        READY   STATUS             RESTARTS         AGE
kube-system   cilium-5tlch                                0/1     CrashLoopBackOff   14 (35s ago)     35m
kube-system   cilium-operator-65496b9554-792xm            1/1     Running            22 (86s ago)     35m
kube-system   coredns-7db6d8ff4d-5d6ls                    0/1     Pending            0                39m
kube-system   coredns-7db6d8ff4d-ztrt5                    0/1     Pending            0                39m
kube-system   etcd-ip-10-10-100-179                       1/1     Running            20 (2m53s ago)   35m
kube-system   kube-apiserver-ip-10-10-100-179             1/1     Running            27 (119s ago)    39m
kube-system   kube-controller-manager-ip-10-10-100-179    0/1     CrashLoopBackOff   30 (36s ago)     36m
kube-system   kube-proxy-snzr2                            1/1     Running            20 (63s ago)     39m
kube-system   kube-scheduler-ip-10-10-100-179             1/1     Running            30 (2m53s ago)   40m
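For what it’s worth, this is roughly how I’ve been inspecting the crashing pods during the windows when the API server is up (pod names taken from the output above):

kubectl -n kube-system logs cilium-5tlch --previous    # logs from the last crashed run
kubectl -n kube-system describe pod kube-controller-manager-ip-10-10-100-179    # events and last container state

The --previous flag is there because the current container is usually mid-restart.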
Swap is off (AWS instances come with it disabled).
The disk is not full.
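For completeness, both are easy to double-check:

swapon --show    # prints nothing when no swap is configured
free -h          # Swap row shows 0B
df -h            # no filesystem anywhere near 100%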
The API server itself keeps dying and being restarted; while it’s down, kubectl can’t connect at all:
ubuntu@ip-10-10-100-179:~$ kubectl get pods -A
Get "https://10.10.100.179:6443/api/v1/pods?limit=500": dial tcp 10.10.100.179:6443: connect: connection refused - error from a previous attempt: read tcp 10.10.100.179:55826->10.10.100.179:6443: read: connection reset by peer
I’m not sure anymore what to try or which log to look in. The most relevant syslog entries I can find are these:
syslog:2024-07-22T09:22:00.492140+00:00 ip-10-10-100-179 kubelet[20988]: E0722 09:22:00.492086 20988 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 20s restarting failed container=kube-apiserver pod=kube-apiserver-ip-10-10-100-179_kube-system(1b3b3892f733afc426e9fd57ef4aa6b2)\"" pod="kube-system/kube-apiserver-ip-10-10-100-179" podUID="1b3b3892f733afc426e9fd57ef4aa6b2"
syslog:2024-07-22T09:19:43.388388+00:00 ip-10-10-100-179 kubelet[14601]: E0722 09:19:43.387968 14601 kuberuntime_container.go:784] "Container termination failed with gracePeriod" err="rpc error: code = Unavailable desc = error reading from server: EOF" pod="kube-system/kube-apiserver-ip-10-10-100-179" podUID="1b3b3892f733afc426e9fd57ef4aa6b2" containerName="kube-apiserver" containerID="containerd://20e419f91fcefe27317feba35fd7f83d8fdf783d6e242af075c44f917cce1866" gracePeriod=30
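Since kubectl itself is down half the time, the only other places I know to look are the kubelet journal and the container runtime directly (assuming containerd with crictl installed; <container-id> is a placeholder):

journalctl -u kubelet --since "10 minutes ago" --no-pager
sudo crictl ps -a | grep kube-apiserver    # find the ID of the dead container
sudo crictl logs <container-id>            # dump its last output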
Anyone hit this before?
I’m a bit confused about what to do next.
I also tried reinstalling with the k8scp.sh script, but I hit the same problem as always:
Error: Unable to install Cilium: Kubernetes cluster unreachable: Get "https://10.10.100.172:6443/version": dial tcp 10.10.100.172:6443: connect: connection refused
Cilium install finished. Continuing with script.
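Between attempts I wipe the node before re-running the script, roughly like this (assuming containerd as the runtime; I may well be missing a cleanup step):

sudo kubeadm reset -f                          # tears down /etc/kubernetes and /var/lib/etcd
sudo rm -rf /etc/cni/net.d $HOME/.kube         # CNI config and old kubeconfig, which reset leaves behind
sudo systemctl restart containerd kubelet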
The API server is already dying right from the start.
Digging deeper into the kubeadm init output from the script:
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
error execution phase addon/coredns: unable to create a new DNS service: rpc error: code = Unknown desc = malformed header: missing HTTP content-type
To see the stack trace of this error execute with --v=5 or higher
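The message suggests --v=5 for a stack trace; if I understand kubeadm’s phases right, the failing step can also be re-run on its own once the API server is briefly reachable (assuming the default config locations):

sudo kubeadm init phase addon coredns --v=5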