I have a deployment that ensures there are x number of pods running. These pods use emptyDir for their working files, but the data is not needed. When my pod fails/restarts (OOM or similar), since the pod tries to restart on the same node, it still has the same data available to it in its emptyDir. In this case, that causes a number of problems for me, since the application behaves differently when it sees it has existing data (attempts repair, which is not wanted here), and also I have other systems trying to reconnect to the same pod.
If there is a way for me to tell the pod to delete on failure instead of restarting on failure, the deployment will spin up a new pod, and my issue with pods trying to repair from their partial storage (and some other headaches) just goes away. So I'm hoping there is a way to have pods delete on failure.
Cluster information:
Kubernetes version: 1.14
Cloud being used: aws
No, it really shouldn't. If the pod crashes, the emptyDir data should be lost, no matter if it's on the same node or not. Are you sure it is there when it crashes?
Can you please check that again?
I think the previous assumption (that the data is still there) might be false, and therefore the problem completely different.
Please check that again, as the analysis might then be completely different.
Hi rata. Yes, I'm sure the data is still there in the emptyDir after a crash. The Kubernetes documentation on how emptyDir works explicitly confirms that this is intended behavior: a container crash does not remove the pod from the node, so emptyDir data survives container restarts and is only deleted when the pod itself is removed.
hi miker256,
Which part of that do you think is a good fit here? I was hoping for a simple way of having the pod just delete on failure. Is there a way to trigger that automatically from what you linked to, or are you saying I might be able to write something to watch the pod for a failure and then call the eviction API when a pod fails? I think that'd end up being the same as just calling a delete on the pod. I was hoping to avoid writing a new service to watch the pod for failures and trigger a delete, and was instead thinking there might be a way to have a pod delete on failure, so an entirely new pod starts up, using native Kubernetes capabilities/configuration.
Thanks for re-confirming that thockin.
I guess my question boils down to this: does anyone know a way to tell Kubernetes NOT to restart the pods in this deployment, but rather delete a pod if it fails for any reason? I'm trying to avoid writing a service to watch these pods and delete them when they fail, so I was hoping there was an option/configuration/trick to get pods to delete on failure.
Just one idea…
If I understand you right, you could live with a restarting pod as long as its emptyDir is in fact empty?
You could use an init container that deletes the content before spinning up the new pod.
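A minimal sketch of that idea (the volume name `workdir`, the mount path, and the images are placeholders, not from the thread): an init container mounting the same emptyDir and wiping it before the main container starts. One caveat to keep in mind: init containers re-run only when the whole pod is restarted, not when an individual container crashes and is restarted in place.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      volumes:
        - name: workdir
          emptyDir: {}
      initContainers:
        - name: clean-workdir
          image: busybox
          # wipe any leftover files from a previous run of this pod
          command: ["sh", "-c", "rm -rf /work/*"]
          volumeMounts:
            - name: workdir
              mountPath: /work
      containers:
        - name: app
          image: my-app:latest   # placeholder image
          volumeMounts:
            - name: workdir
              mountPath: /work
```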
Historically, Deployments required that pods be restart-always. But I guess I don't see why (logically) they have to be; restart-never is semantically consistent, if a little odd. But that's not what is implemented right now.
Hey malagant.
I tried the init container route, but init containers only run when the pod initializes, not when a single container OOMs and restarts. I then looked at using a lifecycle hook to flush the data dir, but hooks turned out not to be reliable about when they execute (the docs do mention that there is no guarantee of exactly when they run). That improves the situation, but I still see some random issues because of it. Being able to have the pod delete on error would make this behave consistently, and would also get me past a secondary issue: some clients handle keepalives and active connections oddly and stick with my unhealthy pod. So a very valid suggestion, but one that doesn't seem to fully fix this particular issue.
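For reference, a lifecycle-hook attempt of the kind described might look roughly like this (the paths and image are placeholders). The unreliability mentioned above is inherent: Kubernetes runs the postStart hook asynchronously with the container's entrypoint and gives no ordering guarantee, so the cleanup can race with the application reading the directory.

```yaml
containers:
  - name: app
    image: my-app:latest   # placeholder image
    lifecycle:
      postStart:
        exec:
          # attempt to clear leftover data on (re)start; timing relative
          # to the container's entrypoint is NOT guaranteed
          command: ["sh", "-c", "rm -rf /work/*"]
    volumeMounts:
      - name: workdir
        mountPath: /work
```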
Ok, I see. But then my last bet is that you change the command attribute of the container into something like this:
rm -rf /content && actual_container_command
This will definitely delete the content dir before the container starts. This might be the solution, but only because you don't need the content to exist before each start.
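In a pod spec, that suggestion would look roughly like this (the `/content` path and `actual_container_command` are placeholders from the post above). Because the kubelet restarts a crashed container by re-running its command, the cleanup executes on every restart, not just on pod creation; using `exec` keeps the application as PID 1 so it still receives signals.

```yaml
containers:
  - name: app
    image: my-app:latest   # placeholder image
    # wipe the working dir, then replace the shell with the real command
    command: ["sh", "-c", "rm -rf /content/* && exec actual_container_command"]
    volumeMounts:
      - name: content
        mountPath: /content
```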
I suspect that a restartPolicy which allowed a maximum number of restarts would satisfy this problem, assuming a pod is replaced via its deployment/HPA configuration once it has exited.