Cluster information:
Kubernetes version: 1.19
Cloud being used: Bare Metal
Installation method: Kubespray
Host OS: CentOS 7, CentOS 8, Ubuntu 20.04
CNI and version: Kube-router 1.2
CRI and version: docker 19.03
Machines: 2
I’ve checked out the latest master of Kubespray (27/03/2021) and configured a single control plane in the hosts file. I then modified the k8s-cluster config to install “kube-router” and use “ipvs”, as in the snippet below.
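For reference, the relevant variables in the k8s-cluster group vars were set roughly like this (the exact file path varies by Kubespray version, but these two keys are the standard ones):

kube_network_plugin: kube-router
kube_proxy_mode: ipvs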
After Kubespray provisioning completes, I run “kubectl get nodes” and both the control plane and the worker node are in Ready state and working with no errors. I also checked “journalctl -xe” and there were no errors.
If I do curl “masterip:6443” on both the control plane and the worker node I get:
Client sent an HTTP request to an HTTPS server.
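(For completeness, the same check over TLS, which should get an actual response from the API server rather than a connection error, is something like:

curl -k https://masterip:6443/version

The -k skips certificate verification; an unauthenticated request may just return a JSON error body, which is still fine as a reachability test.)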
So far so good; everything is beautifully connected and working.
However, the BIG issue arises…
Now I deploy the NGINX ingress controller - I tried both the Helm and the kubectl method. Before deploying it, I put a watch on the curl command on the worker node (commands below) and then deployed:
curl “masterip:6443”
For a few seconds I still get:
Client sent an HTTP request to an HTTPS server.
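For reference, the watch and the Helm deployment were along these lines (the chart repo URL is the standard one; the release name and any chart version are just what I happened to use):

watch -n 1 curl -sS masterip:6443
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx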
As the deployment proceeds, the worker node suddenly returns:
curl: (7) Failed connect to masterip:6443; Connection refused
And now “kubectl get nodes” shows the worker node in NotReady state. So I check “journalctl -xe” and it is full of lines like:
10s": dial tcp masterip:6443: connect: connection refused
To be noted:
- If I go back to the control plane, or run the curl from other machines on the same network, I get:
curl “masterip:6443”
Client sent an HTTP request to an HTTPS server.
which shows it is still reachable on 6443 from outside, but from inside the worker node it suddenly is not.
Why was the worker node able to reach it, and then suddenly not? What is causing this, and how can it be fixed?
- With kube-proxy in iptables mode (instead of ipvs), everything works completely fine.
- The same behaviour occurs on CentOS 7 and CentOS 8.
- I’ve run the kube-proxy cleanup and the problem still occurs.
- For testing I ran “iptables -F”; even then I can’t curl masterip:6443 from inside the worker node, but I can from any other machine.
- If I run “ipvsadm -ln” on the worker node I see:
→ masterip:6443 Masq 1 0 2
The InActConn column shows 2 - why? (See the diagnostic commands after this list.)
- I’ve tested other CNIs (e.g. flannel) and the same thing occurs.
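For anyone who wants to poke at the IPVS state themselves, these are the inspection commands I used, plus a couple of extra standard ones that may help (nothing here is cluster-specific):

ipvsadm -ln                        # virtual services and their real servers
ipvsadm -lnc                       # individual connection entries with their state; connections in a non-established state (e.g. TIME_WAIT) are what InActConn counts
sysctl net.ipv4.vs.conntrack       # whether IPVS traffic is tracked by netfilter conntrack
iptables -t nat -nL KUBE-SERVICES  # NAT rules kube-proxy may still install even in ipvs mode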
Summary: cluster provisioning works fine until an ingress controller (e.g. NGINX) is deployed; as soon as it is, the worker node stops being able to reach masterip:6443, which it could reach before.
Please let me know what further information you need, and how to resolve this.