How can I have a pod delete itself on failure

I have a Deployment that ensures there are x number of pods running. These pods use an emptyDir for their working files, but the data is not needed. When my pod fails/restarts (OOM or similar), since it restarts on the same node, it still has the same data available to it in its emptyDir. That causes a number of problems for me, since the application behaves differently when it sees existing data (it attempts a repair, which is not wanted here), and I also have other systems trying to reconnect to the same pod.
If there is a way for me to tell the pod to delete itself on failure instead of restarting, the Deployment will spin up a new pod, and my issue with pods trying to repair from their partial storage (and some other headaches) just goes away. So I'm hoping there is a way to have pods delete on failure.
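
For context, the relevant part of the Deployment looks roughly like this (name, image, and mount path are placeholders, not my real config):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app                 # placeholder
    spec:
      replicas: 3                  # the "x number of pods"
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: app
            image: my-app:latest   # placeholder
            volumeMounts:
            - name: workdir
              mountPath: /data     # working files live here
          volumes:
          - name: workdir
            emptyDir: {}           # survives container restarts within the pod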

Cluster information:
Kubernetes version: 1.14
Cloud being used: AWS

Thanks!

I would take a look at this.
https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/


No, it really shouldn't. If the pod crashes, the emptyDir data should be lost, no matter if it's on the same node or not. Are you sure it is there when it crashes?

Can you please check that again?

I think the previous assumption (that the data is there) might be false, and therefore the problem is completely different.

Please check that again, to see if the analysis might be completely different :slight_smile:

Hi rata. Yes, I'm sure the data is still there in the emptyDir after a crash. The Kubernetes documentation on how emptyDir works explicitly states that this is the intended behavior.

From Volumes - Kubernetes

Note: A Container crashing does NOT remove a Pod from a node, so the data in an emptyDir volume is safe across Container crashes.

Hi miker256,
Which part of that are you thinking is a good fit here? I was hoping for a simple way of having the pod just delete on failure. Is there a way to automatically trigger that from what you linked to, or were you saying that I could write something to watch the pod for a failure and then call the eviction API when it fails? I think that would end up being the same as just calling a delete on the pod. I was hoping to avoid writing a new service to watch the pod for a failure and trigger a delete, and was rather thinking there might be a way to have a pod delete on failure, so an entirely new pod starts up, using native k8s capabilities/configuration.

This is right - data is preserved unless the pod is failed by the kubelet.

Thanks for re-confirming that, thockin.
I guess my question boils down to whether anyone knows of a way to tell k8s to NOT restart the pods in this Deployment, but rather delete them if they fail for any reason. I'm trying to avoid writing a service to watch these pods and delete them when they fail, so I was hoping there was an option/configuration/trick to get pods to delete on failure.

Just one idea…
If I understand you right, you could live with a restarting pod as long as the emptyDir is in fact empty?
You could use an init container that deletes the contents before spinning up the new pod.
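
A rough sketch of the pod template I have in mind, assuming the emptyDir is named workdir and mounted at /data (both placeholders):

    spec:
      initContainers:
      - name: clean-workdir
        image: busybox
        # wipe whatever a previous run left behind in the emptyDir
        command: ["sh", "-c", "rm -rf /data/*"]
        volumeMounts:
        - name: workdir
          mountPath: /data
      containers:
      - name: app
        image: my-app:latest   # placeholder
        volumeMounts:
        - name: workdir
          mountPath: /data
      volumes:
      - name: workdir
        emptyDir: {}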

Historically Deployments required that pods be restart-always. But I guess I don't see why (logically) they have to be. restart-never is semantically consistent, if a little odd. But that's not what is implemented right now.
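
For reference, a Deployment's pod template only accepts restartPolicy: Always today; something along these lines is rejected by the API server (everything other than restartPolicy is a placeholder):

    spec:
      template:
        spec:
          # Deployments (via ReplicaSets) only allow restartPolicy: Always;
          # Never and OnFailure are only valid for bare Pods and Jobs.
          restartPolicy: Never
          containers:
          - name: app
            image: my-app:latest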

Sorry, my bad! :-/

Thanks, sorry again!

The system is big enough that we all lose track of some details. Thanks for being such a great question-answerer!

Hey malagant.
I tried the init container route, but that only fires when the pod initializes, not when a single container OOMs and restarts. I then looked at using a lifecycle hook to flush the data dir, which improves the situation, but it is not reliable about when it executes (the docs do mention that the timing is not guaranteed), so I still see some random issues because of it. Being able to have the pod delete on error should make this behave consistently, and it also gets me past a secondary issue with some clients that handle keepalives and active connections oddly and stick to my unhealthy pod. So a very valid suggestion, but one that does not fully fix this particular issue.
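
For reference, the hook I experimented with was roughly along these lines (postStart shown as an example; paths are placeholders). Since postStart runs asynchronously with the container's entrypoint, the app can still see the old data before the cleanup happens:

    containers:
    - name: app
      image: my-app:latest   # placeholder
      lifecycle:
        postStart:
          exec:
            # try to wipe the emptyDir as the (restarted) container comes up;
            # there is no guarantee this runs before the entrypoint does
            command: ["sh", "-c", "rm -rf /data/*"]
      volumeMounts:
      - name: workdir
        mountPath: /data     # placeholder working dir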

OK, I see. Then my last bet is that you change the command attribute of the container to something like this:

rm -rf /content/* && actual_container_command

This will clear the content directory before the actual command starts (removing the contents rather than the mount point itself, so the command still runs afterwards). This might be the solution, but only because you don't need the existing content before each start.
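
In a pod spec, that would look roughly like this (/content and actual_container_command are just the placeholders from above):

    containers:
    - name: app
      image: my-app:latest   # placeholder
      # clear the working dir, then exec the real entrypoint so it becomes PID 1
      command: ["sh", "-c", "rm -rf /content/* && exec actual_container_command"]
      volumeMounts:
      - name: workdir
        mountPath: /content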

I'm having almost exactly the same problem.
Did you manage to find a solution for this?

As far as I know, this has not fundamentally changed. If we want to push for something, we'll need a GitHub issue to discuss it.