I have a question about the new-ish Job suspension feature, as outlined in KEP 2232.
In particular, I am interested in suspending long-running Jobs and re-enabling them in a round-robin way at fixed intervals, e.g. if there are 100 Jobs in total, only 10 of them run at any one time.
It looks like this is very close to one of the use cases mentioned in the KEP (enhancements/keps/sig-apps/2232-suspend-jobs at master in kubernetes/enhancements on GitHub), e.g.:
I can write a higher-level Job queueing controller to do this based on external factors. For example, the controller could choose to simply unpause Jobs in the FIFO order.
I was wondering if there are any examples of such ‘higher-level Job queueing controllers’, or of how one would go about creating something like this. Would this need to be a new K8s primitive? Does anyone know if something like this is being planned?
I am looking for any pointers or suggestions on where to find more info on how to do something like this, either with Job suspend or with other approaches.
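
To make the round-robin idea above a bit more concrete, here is a rough sketch of just the rotation logic I have in mind, written in plain Python with no Kubernetes client calls. The function names (`active_window`, `suspend_plan`) and the notion of a "tick" (one fixed interval) are my own placeholders, not anything from the KEP; I am only assuming that a controller would eventually turn the resulting plan into updates of each Job's `spec.suspend` field.

```python
from typing import Dict, List


def active_window(jobs: List[str], window: int, tick: int) -> List[str]:
    """Return the names of the Jobs that should be running during `tick`.

    The window advances by `window` positions per tick and wraps around
    the end of the list, so every Job gets a turn in round-robin order.
    """
    if not jobs or window <= 0:
        return []
    window = min(window, len(jobs))
    start = (tick * window) % len(jobs)
    return [jobs[(start + i) % len(jobs)] for i in range(window)]


def suspend_plan(jobs: List[str], window: int, tick: int) -> Dict[str, bool]:
    """Map each Job name to the desired value of its suspend flag.

    True means "should be suspended", False means "should be running".
    Applying the plan to the cluster is deliberately left out here.
    """
    active = set(active_window(jobs, window, tick))
    return {name: name not in active for name in jobs}


if __name__ == "__main__":
    # Hypothetical Job names, purely for illustration.
    jobs = [f"job-{i}" for i in range(100)]
    for tick in range(3):
        running = active_window(jobs, window=10, tick=tick)
        print(tick, running[0], "...", running[-1])
```

With 100 Jobs and a window of 10, each tick would un-suspend the next 10 Jobs and suspend everything else, returning to the first group after 10 ticks. What I am missing is the controller part that runs this on a timer and applies the plan to the actual Jobs.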