Kubernetes version: 1.28.2
Cloud being used: AWS
Installation method: command line
Host OS: Ubuntu
I have installed Kubernetes on 3 servers: one control plane and two workers. When I run kubectl get nodes, I get this error: The connection to the server 172.31.20.146:6443 was refused - did you specify the right host or port?
Once I reboot the server, I can see all my nodes:
ubuntu@controlplane:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controlplane Ready control-plane 136m v1.28.2
worker1 Ready <none> 109m v1.28.2
worker2 Ready <none> 108m v1.28.2
ubuntu@controlplane:~$ kubectl get all
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 136m
But after a minute it goes down and I see the error again: The connection to the server 172.31.20.146:6443 was refused - did you specify the right host or port?
I have enough resources, such as CPU and memory.
I was on a call with AWS support, and they confirmed that the security groups have the correct access.
There are no issues with any of the VPC components for this EC2 instance. The details are as follows:
- Security Group allows all inbound traffic from 172.31.0.0/16 and allows all outbound traffic.
- Network ACL is open for all incoming/outgoing traffic.
- Route Table routes VPC traffic 172.31.0.0/16 locally.
The issue you’re encountering, where the Kubernetes API server at 172.31.20.146:6443 becomes inaccessible shortly after a reboot, suggests a problem that could be related to the Kubernetes control plane components themselves, networking on your host, or resource constraints that only manifest under certain conditions.
After rebooting, while the nodes are still visible, quickly check the status of all control plane components. You can do this by running the following command on your control plane node:
sudo kubectl get pods --namespace=kube-system
Look for any components that are not in the Running state or are showing repeated restarts.
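Since the API server only answers for about a minute, it may help to filter that output rather than read it by eye. The following is a minimal sketch, not a definitive tool: unhealthy_pods is a hypothetical helper name, and it assumes the default kubectl table layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: print kube-system pods that are not Running
# or that have restarted at least once.
# Reads the default `kubectl get pods` table on stdin
# (assumed columns: NAME READY STATUS RESTARTS AGE).
unhealthy_pods() {
  # NR > 1 skips the header row; "$4 + 0" turns the restart count into a number.
  awk 'NR > 1 && ($3 != "Running" || $4 + 0 > 0) { print $1, $3, $4 }'
}
```

It could be used as: kubectl get pods --namespace=kube-system | unhealthy_pods. An empty result would mean every listed pod is Running with zero restarts at that moment.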
Feb 16 13:15:17 controlplane kubelet[19063]: E0216 13:15:17.265276 19063 run.go:74] "command failed" err="failed to l>
Feb 16 13:15:17 controlplane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 13:15:17 controlplane systemd[1]: kubelet.service: Failed with result 'exit-code'.
Feb 16 13:15:27 controlplane systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 9.
Feb 16 13:15:27 controlplane systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Feb 16 13:15:27 controlplane systemd[1]: Started kubelet: The Kubernetes Node Agent.
Feb 16 13:15:27 controlplane kubelet[19069]: E0216 13:15:27.517399 19069 run.go:74] "command failed" err="failed to l>
Feb 16 13:15:27 controlplane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 13:15:27 controlplane systemd[1]: kubelet.service: Failed with result 'exit-code'.
Feb 16 13:15:37 controlplane systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 10.
Feb 16 13:15:37 controlplane systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Feb 16 13:15:37 controlplane systemd[1]: Started kubelet: The Kubernetes Node Agent.
Feb 16 13:15:37 controlplane kubelet[19075]: E0216 13:15:37.771138 19075 run.go:74] "command failed" err="failed to l>
Feb 16 13:15:37 controlplane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 13:15:37 controlplane systemd[1]: kubelet.service: Failed with result 'exit-code'.
Feb 16 13:15:47 controlplane systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 11.
journalctl -u kubelet | grep kube-apiserver
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.489152 12745 topology_manager.go:215] "Topology Admit Handler" podUID="b9adef3da2c2babf52f5eae940481211" podNamespace="kube-system" podName="kube-apiserver-controlplane"
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.594373 12745 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b9adef3da2c2babf52f5eae940481211-usr-share-ca-certificates\") pod \"kube-apiserver-controlplane\" (UID: \"b9adef3da2c2babf52f5eae940481211\") " pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.594603 12745 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/b9adef3da2c2babf52f5eae940481211-ca-certs\") pod \"kube-apiserver-controlplane\" (UID: \"b9adef3da2c2babf52f5eae940481211\") " pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.594759 12745 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b9adef3da2c2babf52f5eae940481211-etc-ca-certificates\") pod \"kube-apiserver-controlplane\" (UID: \"b9adef3da2c2babf52f5eae940481211\") " pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.594794 12745 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/b9adef3da2c2babf52f5eae940481211-k8s-certs\") pod \"kube-apiserver-controlplane\" (UID: \"b9adef3da2c2babf52f5eae940481211\") " pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:10 controlplane kubelet[12745]: I0219 10:49:10.594825 12745 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b9adef3da2c2babf52f5eae940481211-usr-local-share-ca-certificates\") pod \"kube-apiserver-controlplane\" (UID: \"b9adef3da2c2babf52f5eae940481211\") " pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:40 controlplane kubelet[12745]: E0219 10:49:40.668709 12745 kubelet.go:1890] "Failed creating a mirror pod for" err="Post \"https://172.31.20.146:6443/api/v1/namespaces/kube-system/pods\": dial tcp 172.31.20.146:6443: connect: connection refused" pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:41 controlplane kubelet[12745]: E0219 10:49:41.400598 12745 kubelet.go:1890] "Failed creating a mirror pod for" err="Post \"https://172.31.20.146:6443/api/v1/namespaces/kube-system/pods\": dial tcp 172.31.20.146:6443: connect: connection refused" pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:43 controlplane kubelet[12745]: E0219 10:49:43.725226 12745 kubelet.go:1890] "Failed creating a mirror pod for" err="pods \"kube-apiserver-controlplane\" already exists" pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:44 controlplane kubelet[12745]: E0219 10:49:44.423426 12745 kubelet.go:1890] "Failed creating a mirror pod for" err="pods \"kube-apiserver-controlplane\" already exists" pod="kube-system/kube-apiserver-controlplane"
Feb 19 10:49:50 controlplane kubelet[12745]: E0219 10:49:50.677600 12745 kubelet.go:1890] "Failed creating a mirror pod for" err="pods \"kube-apiserver-controlplane\" already exists" pod="kube-system/kube-apiserver-controlplane"
Hi,
Calico cannot connect to the kube-apiserver. There must be a reason for the kubelet failure. The kubelet needs to be in the running state; otherwise the static pods will not start.
Can you check the /var/log/syslog file? It should contain some information about the kubelet.
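As a rough sketch of how to pull those entries out, something like the following might work. kubelet_failures is a hypothetical helper name, and it assumes the kubelet errors in syslog contain the same "command failed" text as the journal output posted above.

```shell
# Hypothetical helper: print the last N kubelet "command failed" lines
# from a log file, without the pager cutting them off.
# $1: log file (defaults to /var/log/syslog)
# $2: number of lines to show (defaults to 5)
kubelet_failures() {
  grep -F 'kubelet' "${1:-/var/log/syslog}" \
    | grep -F 'command failed' \
    | tail -n "${2:-5}"
}
```

Running kubelet_failures with no arguments reads /var/log/syslog. The journal lines posted earlier end in "failed to l>" most likely because the pager truncated them; sudo journalctl -u kubelet --no-pager should print the complete err= message.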