Hello, I wonder how far a pod's termination grace period can reasonably be extended.
Let’s say I have a process that runs for 1 hour every time it is called, and that can’t be interrupted and resumed. Ideally, I would like to assign it a grace period of 1 hour, to ensure it is able to complete even if the pod is asked to be evicted. However, this does not look like good practice, considering that the grace period is expressed in seconds (which suggests much shorter values are expected) and that cloud processes should be as transient as possible.
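To make this concrete, here is a minimal sketch of what I have in mind, written as a small Python script that prints a pod manifest as JSON. The pod name `long-job` and the image `example.com/long-job:latest` are hypothetical placeholders; `terminationGracePeriodSeconds` is the pod spec field I am referring to.

```python
import json

# One hour, expressed in seconds as the pod spec expects.
GRACE_PERIOD_SECONDS = 60 * 60


def build_pod_manifest(name, image, grace_period_seconds):
    """Return a minimal Pod manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Time the container is given between the termination
            # signal and the forced kill.
            "terminationGracePeriodSeconds": grace_period_seconds,
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical name and image, for illustration only.
manifest = build_pod_manifest(
    "long-job", "example.com/long-job:latest", GRACE_PERIOD_SECONDS
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.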
I also wonder about the potential issues this could cause for the overall logic, for example for preemption, which would need to wait for one hour before the pod is actually terminated and a pod with higher priority can be scheduled in its place. To match the recommendation from the pod-priority-preemption page, I guess such a pod should have the highest priority anyway. But would that be enough to guarantee that Kubernetes/the kubelet won’t try to evict the pod for some other reason? (as described in the scheduling-eviction/node-pressure-eviction page, for example)
In other words, is a large grace period the best, or at least a reliable, way to ensure that such a process can’t end up in a situation where it never completes (the worst case being that the pod is evicted more often than the process duration allows), or are there smarter ways to get such a guarantee? (apart from adapting the process, of course)
Thanks for your help and expertise!