I have a really weird issue with one of my Linode K8s clusters running 1.23. Multiple issues are occurring and I can’t quite pinpoint the root cause.
Linode have let me know it is not an issue with the master and that nothing is wrong on their end. Let me highlight all the identified problems to start.
Logs not Working
When trying to pull logs from any pod, I get this error (which makes it very hard to troubleshoot):
root@aidan:~# kubectl logs <pod-name> -n revwhois-subdomain-enum
Error from server: Get "https://192.168.150.102:10250/containerLogs/revwhois-subdomain-enum/tldbrr-revwhois-worker12-twppv/tldbrr-revwhois-worker12": dial tcp 192.168.150.102:10250: i/o timeout
Metrics not Working
root@m0chan:~# kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
Pod Deletion not Working
When deleting a pod with kubectl delete pod <pod-name> -n <namespace>, the command is accepted, but the pod gets stuck in a Terminating state: the old pod is never removed and a new pod is not launched.
Errors Editing Ingress
Error from server (InternalError): error when creating "yaml/ingress/argo-ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: Temporary Redirect
I also see errors in the Metrics and Cert-Manager logs relating to "failed calling webhook".
This is all for now, and I would really appreciate some help resolving this.
Aidan