Issues with KubeProxy Network Programming Duration

pablokbs · April 23, 2020, 10:22pm

Hello.

I’m having an issue in all of my clusters. The metric "kubeproxy_network_programming_duration_seconds_bucket" shows how long kube-proxy takes to sync the IPVS rules on the nodes to reflect changes in etcd (for example, a pod being removed as an endpoint from a service).

This is causing timeouts on my requests, as some of them go to pods that are already dead.

I can reproduce the error when I restart a lot of pods, but I’ve checked the system metrics of etcd, the controllers and the nodes, and I can’t find any service that is short of CPU or memory. There is no way to add more logging to kube-proxy. Where can I find more information about this? If I google this metric name I can’t find anything helpful, so it seems no one is using it.

So yes, the p99 is 10!!! seconds. That means a pod that’s no longer an endpoint of a service can still receive traffic 10 seconds after the SIGTERM is sent to the pod.
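For anyone wondering how a p99 comes out of the `_bucket` series: a histogram only stores cumulative counts per upper bound, so the percentile is an estimate. Below is a minimal Python sketch of the usual approach (linear interpolation inside the bucket that contains the target rank, similar in spirit to Prometheus's `histogram_quantile`). The bucket bounds and counts are made up for illustration and are not kube-proxy's real values.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by
    upper bound, with the last bound being +Inf.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # how many observations lie at or below the quantile
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Nothing to interpolate against: report the lower edge.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative counts, NOT real kube-proxy data.
example_buckets = [
    (0.25, 600), (0.5, 800), (1.0, 900), (2.0, 950),
    (4.0, 975), (8.0, 988), (16.0, 996), (math.inf, 1000),
]

p99 = histogram_quantile(0.99, example_buckets)
print(f"estimated p99: {p99:.2f}s")
```

Note that the estimate depends heavily on bucket width: with buckets at 8s and 16s, a reported p99 of 10s only says the true value is somewhere between those two bounds.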

I’ve confirmed this behavior by adding a PreStop exec command (a 12s sleep). This sleep allows the pod to keep serving requests even after it is marked as “terminating” and removed from the service endpoints.
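A rough sketch of what such a PreStop hook can look like, written as a Python helper that prints a JSON patch for a Deployment (kubectl accepts JSON as well as YAML). The container name `app` and the 30s grace period are assumptions for illustration, and the image is assumed to ship a `sleep` binary. The check is there because the PreStop hook runs inside `terminationGracePeriodSeconds`, so the grace period has to be longer than the sleep.

```python
import json


def prestop_sleep_patch(container_name, sleep_seconds, grace_period_seconds):
    """Build a Deployment patch that adds a PreStop sleep to one container."""
    if grace_period_seconds <= sleep_seconds:
        # The hook counts against the grace period; the container would be
        # killed before it gets any time to shut down after the sleep.
        raise ValueError("grace period must be longer than the PreStop sleep")
    return {
        "spec": {
            "template": {
                "spec": {
                    "terminationGracePeriodSeconds": grace_period_seconds,
                    "containers": [
                        {
                            "name": container_name,  # hypothetical name
                            "lifecycle": {
                                "preStop": {
                                    "exec": {
                                        "command": ["sleep", str(sleep_seconds)]
                                    }
                                }
                            },
                        }
                    ],
                }
            }
        }
    }


patch = prestop_sleep_patch("app", sleep_seconds=12, grace_period_seconds=30)
print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to `kubectl patch deployment <name> -p '<json>'`. The 12s here simply covers the 10s p99 with a small margin.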

Cluster information:

Kubernetes version: 1.16
Cloud being used: AWS
Installation method: user-data scripts
Host OS: CoreOS
CNI and version: Flannel
CRI and version: Docker 18.06

gamunu · September 22, 2020, 12:12pm

@pablokbs have you found any solution to this? We have recently run into the same thing. I wonder whether these two issues are related.

github.com

kubernetes/community/blob/master/sig-scalability/slos/network_programming_latency.md

## Network programming latency SLIs/SLOs details

### Definition

| Status | SLI | SLO |
| --- | --- | --- |
| __WIP__ | Latency of programming in-cluster load balancing mechanism (e.g. iptables), measured from when service spec or list of its `Ready` pods change to when it is reflected in load balancing mechanism, measured as 99th percentile over last 5 minutes aggregated across all programmers<sup>[1](#footnote1)</sup> | In default Kubernetes installation, 99th percentile per cluster-day <= X |

<a name="footnote1">[1\]</a>Aggregation across all programmers means that all
samples from all programmers go into one large pool, and SLI is percentile
from all of them.

### User stories
- As a user of vanilla Kubernetes, I want some guarantee how quickly new backends
of my service will be targets of in-cluster load-balancing
- As a user of vanilla Kubernetes, I want some guarantee how quickly deleted
(or unhealthy) backends of my service will be removed from in-cluster
load-balancing
- As a user of vanilla Kubernetes, I want some guarantee how quickly changes
to service specification (including creation) will be reflected in in-cluster

This file has been truncated.

Update: It seems this metric has trouble recording accurate values, as documented in the SLI/SLO caveats and here: https://github.com/kubernetes/kubernetes/issues/82378
