Gauging interest here!
Most existing Kubernetes schedulers (the default scheduler, Volcano, YuniKorn, Kueue, etc.) are still largely hardware-agnostic. This creates inefficiencies when running AI/ML workloads on specialized accelerators like GPUs, TPUs, Trainium, or Inferentia. The result: resource contention, GPU fragmentation, and unnecessary infrastructure costs.
I’m working on a new scheduler that will:
- Match jobs to hardware based on actual requirements (GPU memory, compute power, etc.); a rough sketch of this and the cost-aware piece follows the list.
- Support multi-job sharing on the same accelerator to improve throughput.
- Enable adaptive prioritization and preemption policies.
- Incorporate cloud pricing models for cost-aware scheduling (spot vs. on-demand).
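
To make the hardware-matching and cost-aware bullets concrete, here's a minimal sketch of how they could plug into the Kubernetes scheduler framework as a Filter + Score plugin. Everything specific here is a placeholder, not a committed design: the `AcceleratorFit` name and the `example.com/gpu-memory-gib` and `example.com/capacity-type` annotation/label keys are hypothetical, invented for illustration.

```go
package acceleratorfit

import (
	"context"
	"strconv"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// Hypothetical keys, for illustration only; not an existing convention.
const (
	gpuMemAnnotation = "example.com/gpu-memory-gib" // set on the pod: per-GPU memory it needs
	gpuMemLabel      = "example.com/gpu-memory-gib" // set on the node: per-GPU memory it offers
	capacityLabel    = "example.com/capacity-type"  // set on the node: "spot" or "on-demand"
)

const Name = "AcceleratorFit"

type AcceleratorFit struct {
	handle framework.Handle
}

var (
	_ framework.FilterPlugin = &AcceleratorFit{}
	_ framework.ScorePlugin  = &AcceleratorFit{}
)

// New is the plugin factory; the exact signature varies across scheduler
// framework versions.
func New(_ runtime.Object, h framework.Handle) (framework.Plugin, error) {
	return &AcceleratorFit{handle: h}, nil
}

func (pl *AcceleratorFit) Name() string { return Name }

// Filter drops nodes whose GPUs have less memory than the pod declares it
// needs, instead of treating every "1 GPU" as interchangeable.
func (pl *AcceleratorFit) Filter(_ context.Context, _ *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	want, err := strconv.Atoi(pod.Annotations[gpuMemAnnotation])
	if err != nil {
		return nil // pod declares no GPU-memory requirement; nothing to filter on
	}
	have, err := strconv.Atoi(nodeInfo.Node().Labels[gpuMemLabel])
	if err != nil || have < want {
		return framework.NewStatus(framework.Unschedulable, "insufficient GPU memory")
	}
	return nil
}

// Score nudges placement toward spot capacity; a real implementation would
// pull live pricing from the cloud provider rather than a static label.
func (pl *AcceleratorFit) Score(_ context.Context, _ *framework.CycleState, _ *v1.Pod, nodeName string) (int64, *framework.Status) {
	nodeInfo, err := pl.handle.SnapshotSharedLister().NodeInfos().Get(nodeName)
	if err != nil {
		return 0, framework.AsStatus(err)
	}
	if nodeInfo.Node().Labels[capacityLabel] == "spot" {
		return framework.MaxNodeScore, nil
	}
	return 0, nil
}

func (pl *AcceleratorFit) ScoreExtensions() framework.ScoreExtensions { return nil }
```

In practice the filter data would come from a device plugin or node feature discovery rather than hand-set labels, and the score would fold in live pricing feeds; the sketch is just meant to show where these policies slot into the scheduling cycle.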
The plan is to release this as an open-source library and contribute it back to the K8s community, with active engagement at KubeCon and beyond. The goal is to maximize accelerator efficiency while reducing costs, creating real impact for AI/ML workloads at scale.
Would love to hear thoughts from the community—what pain points do you see today with GPU/accelerator scheduling?