Typically, we use a container liveness probe to monitor a container within a Pod. If the probe fails, the kubelet restarts the container, not the Pod. If the container keeps failing, it enters the CrashLoopBackOff state. Even in this state the container keeps being retried, but the Pod itself stays on the same node.
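For reference, this is roughly the kind of setup I mean (a minimal sketch; the Pod name, image, `/healthz` path, and probe values are just placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: myapp
    image: myapp:latest        # placeholder image
    livenessProbe:
      httpGet:
        path: /healthz         # assumed HTTP health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 3      # after 3 consecutive failures the kubelet restarts the container only
```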
If a container keeps having problems, can I terminate the Pod itself and force it to be rescheduled onto another node?
The goal is to automatically give an unhealthy container one more chance to run on another node, for high availability, before an administrator has to intervene.
I think this would be possible by developing an operator, but I’m also curious whether a feature like this already exists.