Clients are disconnected when the pod is terminating or fails the readiness probe

Hi,

I’m managing a server on k8s that serves an HTTP API whose requests take quite a long time to complete.
The pods are deployed as a StatefulSet and use RollingUpdate as the update strategy.
Also, the Service type is LoadBalancer.
For maintenance, when I update my server, each pod should wait for all in-flight requests to be answered before exiting (i.e., a graceful shutdown).

I read the following articles:

After reading them, my understanding of the pod termination process is as follows:

  1. Change the status of the pod to Terminating and remove it from the Service endpoints.
    : Once the pod is in this state, the LoadBalancer doesn’t send new requests to it.
  2. Execute the preStop hook if one is defined.
  3. Send SIGTERM to the pod’s containers.
  4. Wait for the pod to terminate within terminationGracePeriodSeconds.
  5. If terminationGracePeriodSeconds expires, send SIGKILL to the pod.
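For reference, what I mean by a graceful shutdown at steps 3-5 is roughly the sketch below (Go net/http purely as an illustration, not my actual code; the 60-second budget is only an example and has to stay below terminationGracePeriodSeconds):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	go func() {
		// Serve requests until Shutdown is called.
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %v", err)
		}
	}()

	// Step 3 above: the kubelet sends SIGTERM to the container.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Steps 4-5: stop accepting new connections and wait for in-flight
	// requests to finish before the grace period runs out.
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("graceful shutdown incomplete: %v", err)
	}
}
```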

At step 1, I expected that the LoadBalancer would stop sending new requests to this pod, but would NOT disconnect the connections that were established before this step.
However, in my environment it closes all the client connections, and clients get a “connection reset by peer” error.
On the server side, the server isn’t aware of this; it tries to write a response to the closed connection and blocks.
Independently of the termination process, I see the same thing when I simply make the pod fail its readiness probe while it is processing client requests.
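To make the server side concrete, here is a rough, purely illustrative Go sketch of the kind of long-running handler I mean. In the sketch the handler watches the request context so it can at least notice that the connection has gone away instead of blocking on the final write, but that is a workaround rather than the behavior I expected at step 1:

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// slowHandler stands in for a long-running API handler.
func slowHandler(w http.ResponseWriter, r *http.Request) {
	// Do the long work in slices so cancellation can be checked in between.
	for i := 0; i < 30; i++ {
		select {
		case <-r.Context().Done():
			// net/http cancels the request context when the underlying
			// connection is closed; there is no point writing a response.
			log.Printf("connection gone: %v", r.Context().Err())
			return
		case <-time.After(1 * time.Second):
			// one more second of "work"
		}
	}
	w.Write([]byte("done\n"))
}

func main() {
	http.HandleFunc("/slow", slowHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```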

I’m using my company’s internal k8s platform, and I raised this issue with its managers.
They said that closing client connections when a pod is removed from the Service endpoints is officially specified k8s behavior.
However, I think keeping the connections open and letting the pod handle them gracefully is more reasonable.

Could you please confirm whether this is really part of the k8s spec or not?
Several docs say that pods in the Terminating or not-ready state will not receive new connections, but it is hard to find an official doc that says whether already-established connections are closed or not.
Could you also suggest some k8s settings that I or our platform managers could try in order to solve this issue?

Thanks!

Cluster information:

Kubernetes version: v1.15.10
I’m sorry, but since I’m using my company’s internal k8s platform (as mentioned above), the detailed cluster information isn’t visible to me.

Hi Junghoon:

From the Pod Lifecycle documentation you’ve provided:

Pods that shut down slowly cannot continue to serve traffic as load balancers (like the service proxy) remove the Pod from the list of endpoints as soon as the termination grace period begins.

As the pod is removed as a valid endpoint, your client gets a connection reset by peer.

I am no developer, but regarding the 12 factor app:

Processes shut down gracefully when they receive a SIGTERM signal from the process manager. For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost.

Your description seems to fit the “long polling” scenario described here, so maybe the application can be updated to retry the unprocessed request (against a different pod).
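As a rough sketch of that retry idea (not a drop-in solution: the URL, retry count, and backoff are placeholders, and it only makes sense if the request is idempotent and safe to replay), the client side could look something like this in Go:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// getWithRetry retries a GET a few times. A "connection reset by peer"
// surfaces as a transport error here, and the next attempt is routed by
// the Service to whichever pods are still ready.
func getWithRetry(url string, attempts int) (*http.Response, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := http.Get(url)
		if err == nil {
			return resp, nil
		}
		lastErr = err
		time.Sleep(time.Duration(i+1) * time.Second) // simple linear backoff
	}
	return nil, fmt.Errorf("all %d attempts failed: %w", attempts, lastErr)
}

func main() {
	resp, err := getWithRetry("http://my-service.example/slow", 3)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

A real client would also want to cap the total time spent retrying and only replay requests that are safe to repeat.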

Best regards,

Xavi

The unfortunate answer is that it was under-defined. Both behaviors exist.

With the rise of EndpointSlice, we have more metadata to work with, and sig-net is discussing what the ideal behavior should be. That said, we can’t just expect everyone to change their implementations overnight. There’s got to be some amount of “implementation-defined” freedom.

In MY opinion, connections MUST survive while an endpoint is marked as terminating but MAY be killed when an endpoint is removed. To do that cleanly, we have open KEPs to track that intermediate state.

Hi Xavi and thockin.

Thank you for answering my question.

It seems debatable whether removing a pod from the endpoints implies closing the client connections.
I tested the same server in another k8s environment, but it didn’t close the client connections when the pod was removed from the endpoints.
Therefore, I think saying the answer is undefined or under-defined is correct, as thockin said.

I’m not sure which part of the environment decides this behavior, but I hope that someday k8s offers an option to choose the behavior explicitly, or a hook like preStop that runs before the pod is removed from the endpoints.
For now, I think I have to find other ways to avoid this issue.

Thanks a lot!

I tested the same server in another k8s environment, but it didn’t close the client connections when the pod was removed from the endpoints.

Today we just don’t spec that, and so implementations do what they want. But also, today we do not distinguish “this endpoint exists but is terminating” from “this endpoint doesn’t exist”. Once we have that, I think implementations can be smarter.

I’m not sure which part of the environment decides this behavior, but I hope that someday k8s offers an option to choose the behavior explicitly

It’s a combination of the service proxy (kube-proxy, usually) and the LB implementation. I don’t want to add parameters here, but as I said, I think more metadata will allow better impl choices. Coming soon.