Healthz problems on kubeadm init (cluster of 3 masters + 2 workers + 1 load balancer)


Cluster information:

Kubernetes version: 1.27
Cloud being used: local
Installation method: kubeadm
Host OS: CentOS 7
CNI and version: not yet
CRI and version: not yet

I'm trying to set up HA according to the official guide, using the haproxy.cfg option, with HAProxy running on an external server.
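For reference, the relevant part of my haproxy.cfg looks roughly like this (a sketch along the lines of the kubeadm HA guide; the balancer listens on 192.168.14.179:6443 and the backends are my three masters, as the IPs in the logs below show):

```
frontend apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend apiserver

backend apiserver
    mode tcp
    option tcp-check
    balance roundrobin
    server master1 192.168.14.180:6443 check
    server master2 192.168.14.181:6443 check
    server master3 192.168.14.182:6443 check
```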

The control-plane endpoint keeps returning connection refused on the /healthz checks while kubeadm tries to create the static control-plane pods.

Here are its attempts during kubeadm init (following this website's guide):
[wait-control-plane] Waiting for the API server to be healthy
I0809 21:57:25.443950 6708 loader.go:373] Config loaded from file: /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory “/etc/kubernetes/manifests”. This can take up to 4m0s
I0809 21:57:25.446016 6708 round_trippers.go:463] GET https://192.168.14.179:6443/healthz?timeout=10s
I0809 21:57:25.446072 6708 round_trippers.go:469] Request Headers:
I0809 21:57:25.446144 6708 round_trippers.go:473] Accept: application/json, */*
I0809 21:57:25.446203 6708 round_trippers.go:473] User-Agent: kubeadm/v1.27.4 (linux/amd64) kubernetes/fa3d799
I0809 21:57:25.451832 6708 round_trippers.go:574] Response Status: in 5 milliseconds
I0809 21:57:25.451909 6708 round_trippers.go:577] Response Headers:
I0809 21:57:26.456251 6708 with_retry.go:234] Got a Retry-After 1s response for attempt 1 to https://192.168.14.179:6443/healthz?timeout=10s
I0809 21:57:26.456988 6708 round_trippers.go:463] GET https://192.168.14.179:6443/healthz?timeout=10s
I0809 21:57:26.457050 6708 round_trippers.go:469] Request Headers:
I0809 21:57:26.457111 6708 round_trippers.go:473] Accept: application/json, */*
I0809 21:57:26.457170 6708 round_trippers.go:473] User-Agent: kubeadm/v1.27.4 (linux/amd64) kubernetes/fa3d799
I0809 21:57:26.463103 6708 round_trippers.go:574] Response Status: in 5 milliseconds
I0809 21:57:26.463171 6708 round_trippers.go:577] Response Headers:


These same errors go on and on until init fails.

I have checked connectivity (nc -v) from the balancer to the nodes, and it is as required.
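That is, from the load balancer I ran checks along these lines (IPs as they appear in my logs; 6443 is the API server port):

```
nc -v -z -w 2 192.168.14.180 6443
nc -v -z -w 2 192.168.14.181 6443
nc -v -z -w 2 192.168.14.182 6443
```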

Obviously I've set up haproxy.cfg to match the static IPs of my nodes.

I've tried different guides from Google, but they all produced the same error.

I0809 21:57:27.465139 6708 with_retry.go:234] Got a Retry-After 1s response for attempt 2 to

Fixed.
Somehow the swapoff -a and the /etc/fstab changes I had already made were undone on reboot… after redoing them, init works.
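For anyone hitting the same thing: the fix has two parts, swapoff -a for the running system and commenting out the swap entry in /etc/fstab so it survives reboots. A sketch (the sed here runs against a demo copy so it can be tried unprivileged; on a real node point it at /etc/fstab and also run swapoff -a as root):

```shell
# demo fstab with a typical CentOS 7 swap entry
printf '/dev/mapper/centos-swap swap swap defaults 0 0\n' > fstab.demo
# comment out any uncommented line whose filesystem type is "swap"
sed -i 's|^\([^#].*[[:space:]]swap[[:space:]].*\)$|#\1|' fstab.demo
cat fstab.demo   # → #/dev/mapper/centos-swap swap swap defaults 0 0
```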
Although it says it has no CRI?

Now I'm having another error: unable to pull an image on master2… (master1 was init'ed, and master3 joined successfully.)
image exists: /kube-controller-manager:v1.27.4
I0810 14:48:58.769738 7968 checks.go:846] image exists: /kube-scheduler:v1.27.4
I0810 14:48:58.970691 7968 checks.go:846] image exists: /kube-proxy:v1.27.4
W0810 14:48:59.167694 7968 checks.go:835] detected that the sandbox image “registry.k8s.io/pause:3.6” of the container runtime is inconsistent with that used by kubeadm. It is recommended that using “registry.k8s.io/pause:3.9” as the CRI sandbox image.
I0810 14:48:59.340139 7968 checks.go:846] image exists: /pause:3.9
I0810 14:48:59.488045 7968 checks.go:846] image exists: /etcd:3.5.7-0
I0810 14:48:59.680870 7968 checks.go:846] image exists: /coredns/coredns:v1.10.1
[preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image registry.k8s.io/kube-apiserver:v1.27.4: output: E0810 14:48:58.445690 8016 remote_image.go:171] “PullImage from image service failed” err="rpc error: code = NotFound desc = failed to pull and unpack image "registry.k8s.io/kube-apiserver:v1.27.4": failed to extract layer
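As an aside, the pause:3.6 vs 3.9 warning above is usually resolved by pointing containerd at the newer sandbox image (assuming containerd is the CRI; edit /etc/containerd/config.toml and then restart containerd):

```
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.k8s.io/pause:3.9"
```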

I took my time doing the third join… on master2 I'm having trouble joining (these errors).
Could it be that the token has expired? I even created a new certificate key and pasted it into the join command…
Also (I'm pretty new at this, as you can imagine): should I create the new certificate on the node I'm trying to join, or on the one where I ran init? (I'm thinking in SSH terms, and I know that's wrong.)
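For context, by "created a new certificate" I mean commands along these lines (per the kubeadm docs these run on a node that is already a working control plane, e.g. master1):

```
# prints a fresh join command with a new bootstrap token
kubeadm token create --print-join-command
# re-uploads the control-plane certs and prints a new --certificate-key
kubeadm init phase upload-certs --upload-certs
```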

This is very weird: I can do a ctr image pull of that very same image on that very same node, but the join command can't???
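One possible explanation (an assumption on my part): a plain ctr pull goes into containerd's "default" namespace, while Kubernetes looks in the "k8s.io" namespace, so an image that bare ctr can see may still be invisible to kubeadm. To pull into the namespace kubeadm actually uses:

```
ctr -n k8s.io images pull registry.k8s.io/kube-apiserver:v1.27.4
# or via the CRI, which always targets the k8s.io namespace:
crictl pull registry.k8s.io/kube-apiserver:v1.27.4
```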

I get a failure installing the etcd component… anyone with a clue how to discover the cause?

(This happens when joining a master to the 2 other masters, which are working):

[etcd] Announced new etcd member joining to the existing etcd cluster
I0811 15:11:18.225075 1639 local.go:165] Updated etcd member list: [{master1 https://192.168.14.180:2380} {master2 https://192.168.14.181:2380} {master3 https://192.168.14.182:2380}]
[etcd] Creating static Pod manifest for “etcd”
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I0811 15:11:18.226110 1639 etcd.go:585] [etcd] attempting to see if all cluster endpoints ([https: https: https:/]) are available 1/8
I0811 15:11:20.924161 1639 etcd.go:565] Failed to get etcd status for https://: failed to dial endpoint https: with maintenance client: context deadline exceeded
I0811 15:11:23.417981 1639 etcd.go:565] Failed to get etcd status for https:: failed to dial endpoint https:/ with maintenance client: context deadline exceeded
I0811 15:11:25.758658 1639 etcd.go:565] Failed to get etcd status for https: failed to dial endpoint https://192.168.14.181:2379 with maintenance client: context deadline exceeded

Solution:
Create a new machine (not cloned) in VirtualBox, and retry.