I’ve been trying to debug an issue we’ve been having on our cluster for the past few weeks.
We are running EKS on AWS (Kubernetes version 1.20) to deploy and run our stacks, and we use Calico policies to isolate the namespaces from each other. All of the stacks consist of services exposed via ClusterIPs, and the cluster uses the AWS VPC CNI. So far EKS has behaved really well, and the integrations with AWS services are generally good and easy to set up.
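For context, the namespace isolation we apply looks roughly like the sketch below: a Calico policy that allows ingress only from pods in the same namespace. This is a minimal illustration, not our exact policy; the namespace name `my-stack` and the policy name are placeholders.

```yaml
# Hypothetical sketch of a Calico namespace-isolation policy.
# Allows ingress only from pods within the same namespace;
# everything else is implicitly denied once a policy selects the pod.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: isolate-namespace        # placeholder name
  namespace: my-stack            # placeholder namespace
spec:
  selector: all()                # applies to all pods in this namespace
  types:
    - Ingress
  ingress:
    - action: Allow
      source:
        namespaceSelector: projectcalico.org/name == 'my-stack'
```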
However, we hit an issue where pods running on the Windows nodes could not connect to other services running within the same namespace. After a lot of testing, it turns out the Calico policies cause kube-proxy on Windows to fail to create the HNS load balancer policy inside the l2bridge network.
I tried to track everything I’ve been doing in this ticket: https://github.com/kubernetes-sigs/sig-windows-tools/issues/168
which was possibly the wrong place to file it, but I was young and dumb back then, 2 days ago.
So far I don’t know whether it is expected behaviour on kube-proxy’s side to ignore endpoints that have ANY policy attached, or whether this is a bug. Due to the lack of actual tooling on Windows (besides hnsdiag) to figure out the details of what’s really going on, this is not easy to do a deep dive on, and I also lack the knowledge to compile my own kube-proxy to see what is happening.
The only other option would be to implement network isolation in another “supported” way.
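One such “supported” alternative might be plain Kubernetes-native NetworkPolicy objects instead of Calico’s own API, in case kube-proxy on Windows handles those differently. A minimal sketch of an equivalent same-namespace isolation policy (namespace name is again a placeholder):

```yaml
# Hypothetical sketch: Kubernetes-native equivalent of the Calico policy.
# Selects all pods in the namespace and allows ingress only from pods
# in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-namespace        # placeholder name
  namespace: my-stack            # placeholder namespace
spec:
  podSelector: {}                # all pods in this namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}        # any pod in the same namespace
```

Whether this actually avoids tripping the HNS load balancer policy creation is exactly what I’d need to test.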