Hi, first time posting here, so I hope I am in the right place.
Cluster information:
Kubernetes version: 1.31.2
Cloud being used: AKS
Installation method: Terraform
Host OS: Ubuntu 22.04.5 LTS (AKSUbuntu-2204gen2containerd-202412.10.0)
CNI and version: calico 3.28.2
CRI and version: containerd 1.7.23-1
Kube-Proxy version: 1.31.2
Issue
I noticed high CPU usage (~1000m per instance) from kube-proxy after upgrading the Azure Kubernetes cluster from 1.30.6 to 1.31.2. The logs show that kube-proxy is updating the endpoints every second.
Example:
I0107 02:48:02.195767 1 endpointslicecache.go:303] "Setting endpoints for service port name" portName="test/postgres-postgres:postgresql" endpoints=["10.128.42.153:5432"]
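To put a number on "every second", a rough sketch like the following could count these log lines per second and per port name. It assumes the log format shown in the example line above; the function name `count_updates` is made up for illustration.

```python
import re
from collections import Counter

# Matches the kube-proxy log line format shown above. Only the time
# (truncated to the second) and the portName value are captured.
LINE_RE = re.compile(
    r'^I\d{4} (\d{2}:\d{2}:\d{2})\.\d+ .*'
    r'"Setting endpoints for service port name" portName="([^"]+)"'
)

def count_updates(lines):
    """Return a Counter mapping (second, portName) to the number of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

The input can be any iterable of lines, for example an open file containing the output of `kubectl logs` for one kube-proxy pod. Lines that do not match the pattern are skipped.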
It only happens for our Postgres instances, which we deploy with zalando-operator. I also saw that the optime, renewTime and slots annotations update every second on the Endpoints object:
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    acquireTime: "2025-01-07T08:07:07.213129+00:00"
    leader: postgres-postgres-0
    optime: "160141489064"
    renewTime: "2025-01-07T14:46:35.380109+00:00"
    retain_slots: '["postgres_postgres_0"]'
    slots: '{"postgres_postgres_0":160141489064}'
    transitions: "15"
    ttl: "30"
  creationTimestamp: "2024-12-13T07:22:52Z"
  labels:
    application: spilo
    cluster-name: postgres-postgres
  name: postgres-postgres
  namespace: test
  resourceVersion: "113802184"
  uid: e8c588a1-5646-4103-b3ba-71e3a35e794f
subsets:
- addresses:
  - hostname: postgres-postgres-0
    ip: 10.128.42.153
    nodeName: aks-system-38846450-vmss00000o
    targetRef:
      kind: Pod
      name: postgres-postgres-0
      namespace: test
      resourceVersion: "113582861"
      uid: b03e00d8-492a-4bb3-953d-5117cedc114c
  ports:
  - name: postgresql
    port: 5432
    protocol: TCP
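To confirm that only these annotations change and the addresses and ports stay the same, two snapshots of the object could be compared with a small sketch like the one below. It assumes the snapshots are already loaded as dicts (for example from the JSON output of `kubectl get endpoints postgres-postgres -n test -o json`). The set of "volatile" keys is taken from what I observed above, not from any official list, and the function names are made up for illustration.

```python
import copy

# Annotation keys that change every second in the object above.
# This set is an assumption based on observation only.
VOLATILE_ANNOTATIONS = {"optime", "renewTime", "slots"}

def strip_volatile(endpoints):
    """Return a copy of the object without the constantly changing fields."""
    obj = copy.deepcopy(endpoints)
    meta = obj.get("metadata", {})
    # resourceVersion changes on every write, so it is ignored as well.
    meta.pop("resourceVersion", None)
    meta.pop("managedFields", None)
    annotations = meta.get("annotations", {})
    for key in VOLATILE_ANNOTATIONS:
        annotations.pop(key, None)
    return obj

def only_volatile_changed(old, new):
    """True if old and new differ, but only in the volatile fields."""
    return old != new and strip_volatile(old) == strip_volatile(new)
```

If `only_volatile_changed` returns True for snapshots taken a second apart, the subsets (addresses and ports) are identical and only the annotations and resourceVersion differ.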
This issue did not occur, or at least was not visible, with AKS 1.30.6:
[Graph: log volume of kube-proxy]
[Graph: CPU usage of kube-proxy]
Is it possible to tell kube-proxy to ignore those changing annotations? Or do I have to open an issue for the zalando-operator? Or is it even an issue with kube-proxy?
Thanks in advance for the help!