How does K8S handle scheduling and parallelism with job submissions

jerryZ · October 20, 2020, 11:31pm

Something I’m trying to figure out is how K8S handles node allocation and scheduling, and which component is responsible for ensuring the desired resources are available. I’m porting an existing MPI job submission system to K8S. With these MPI jobs, all the nodes have to be present before execution starts. One particular thing I tested was what K8S does when I request 8 nodes but only 5 are available. What I observed with my kind setup is that, whether I used parallelism in a Job manifest or replicas in an mpi-operator manifest, it would run 5 and then run the remaining 3 sequentially. In these cases I didn’t call MPI; I just wanted to see how K8S would schedule them.

Am I wrong in expecting K8S to either hold the submission until the desired number of nodes is available or fail the submission? Is this outside the scope of K8S’s responsibility, so that a separate service needs to handle scheduling? Considering that MPI needs all nodes up at the same time, does this mean I found a bug in mpi-operator?
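
To make the difference between what I observed and what I expected concrete, here is a toy Python model. It is only a sketch, not Kubernetes or mpi-operator code: the function names `greedy_start` and `gang_start` are made up for illustration, and it assumes each worker needs one whole node.

```python
from __future__ import annotations


def greedy_start(requested: int, free_nodes: int) -> tuple[int, int]:
    """Observed behaviour: start whatever fits now, leave the rest waiting.

    Returns (running, waiting). Assumes one worker per node (illustrative).
    """
    running = min(requested, free_nodes)
    return running, requested - running


def gang_start(requested: int, free_nodes: int) -> tuple[int, int]:
    """Expected behaviour: all-or-nothing.

    Workers start only if every one of them can be placed at the same time;
    otherwise the whole submission is held.
    """
    if requested <= free_nodes:
        return requested, 0
    return 0, requested


if __name__ == "__main__":
    # The case from my test: 8 requested, 5 nodes available.
    print(greedy_start(8, 5))  # (5, 3): five run, three wait their turn
    print(gang_start(8, 5))    # (0, 8): nothing starts until all eight fit
```

The first function matches the 5-then-3 pattern I saw; the second is what an MPI job actually needs, since a partial set of ranks cannot make progress.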
