Force/Faster retry of "scheduling failed"

Backwasch · November 30, 2024, 6:26pm

Hey :=),
I am working on a custom scheduler that works roughly like a batch scheduler, but I have a small problem.
For example, I want to do the following (oversimplified):
Schedule Pod A first but keep it waiting (via the Permit phase), then schedule Pod B. If, for whatever reason, none of the nodes satisfies our requirements, redo the scheduling process for Pod A, but filter out the node that was previously chosen for Pod A.
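To make the intended flow concrete, here is a minimal standalone sketch of the "retry but skip the previously chosen node" idea. It deliberately does not use the real scheduler framework: the Node type, the pickNode helper, and the CPU-only fit check are hypothetical stand-ins for illustration.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a cluster node; only free CPU is modelled.
type Node struct {
	Name    string
	FreeCPU int
}

// pickNode returns the first node that has enough free CPU and is not excluded.
// The bool result is false when no node qualifies.
func pickNode(nodes []Node, cpu int, excluded map[string]bool) (string, bool) {
	for _, n := range nodes {
		if excluded[n.Name] || n.FreeCPU < cpu {
			continue
		}
		return n.Name, true
	}
	return "", false
}

func main() {
	nodes := []Node{{"node-1", 4}, {"node-2", 4}}
	excluded := map[string]bool{}

	// First attempt: Pod A is tentatively placed and would wait in Permit.
	first, _ := pickNode(nodes, 2, excluded)

	// Suppose Pod B then fits nowhere, so Pod A is rejected and the node
	// it was placed on is remembered for the next attempt.
	excluded[first] = true

	// Second attempt for Pod A skips the previously chosen node.
	second, ok := pickNode(nodes, 2, excluded)
	fmt.Println(first, second, ok)
}
```

In a real plugin the exclusion set would presumably have to be kept per pod (for example keyed by pod UID) and consulted in the Filter phase; that part is an assumption and is not shown here.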

When I use the following in the permit phase to accomplish this:

return framework.NewStatus(framework.Unschedulable, "Test case 2") (to discard the current Pod) or
waitingPod.Reject(ss.Name(), "Test case 4") (to discard a previous Pod),
depending on what we want to accomplish.

The selected pods get marked as "scheduling failed" but remain in the "Pending" state. However, the time until the scheduler retries them is inconsistent. Sometimes it retries the pods instantly (which is what I want), and sometimes nothing happens for up to 5 to 6 minutes.
So basically my question is: what causes this behavior? Specifically, what determines when the scheduler retries a pod after its scheduling has failed? And is it possible to reduce the retry time? Can this be achieved by tags in the deployment file, by configuration changes, or by manually moving the pods into the activeQ?

Maybe somebody can help me.
