Cluster information:
Kubernetes version: 1.22.11
Cloud being used: Azure/AKS
Installation method:
Host OS: Linux/Windows
CNI and version: Azure CNI
CRI and version: containerd (containerd-2022.09.13)
We have several Kubernetes clusters running in AKS, each with Linux and Windows node pools and deployments on both. We use the cluster autoscaler to scale the node pools, and most of our deployments are scaled by either an HPA or KEDA.
We are troubleshooting intermittent 502s affecting most of our apps running in Kubernetes. While digging in, we found in the kube-proxy logs that the EndpointSliceCache is being set to only one IP address, even though multiple pods are running behind the service. Here is an example log message:
```
2022-10-15T08:56:04.570885342Z stderr F I1015 08:56:04.570747 1 endpointslicecache.go:366] Setting endpoints for "mynamespace/myservice" to [10.180.160.124:8080 ]
```
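For anyone trying to reproduce what we're seeing, this is roughly how we've been comparing the EndpointSlice contents against the pods the service should be selecting (the namespace/service names are taken from the log line above; substitute your own):

```
# EndpointSlices are linked to their Service via the standard
# kubernetes.io/service-name label set by the EndpointSlice controller.
kubectl get endpointslices -n mynamespace \
  -l kubernetes.io/service-name=myservice -o yaml

# Pod IPs the Service's selector should be picking up, for comparison
# against the addresses listed in the slice above.
kubectl get pods -n mynamespace -o wide
```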
This particular deployment had 2 pods running at the time. Should we be concerned about this? It seems like we would never want only one IP address in the EndpointSliceCache when multiple pods are running, since that would defeat the purpose of HA at the pod level.
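One thing we are still ruling out on our side: as far as we understand, kube-proxy only programs endpoints whose `ready` condition is true, so a pod that is mid-rollout or failing its readiness probe at that moment would be dropped from the list even while it shows as Running. An illustrative way to dump the per-endpoint ready condition:

```
# Print each endpoint address with its ready condition; endpoints
# reporting ready=false are excluded from kube-proxy's forwarding set
# (unless the Service sets publishNotReadyAddresses).
kubectl get endpointslices -n mynamespace \
  -l kubernetes.io/service-name=myservice \
  -o jsonpath='{range .items[*].endpoints[*]}{.addresses[0]}{" ready="}{.conditions.ready}{"\n"}{end}'
```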