If I set a low CPU limit for a job container, the delay from the Pod being scheduled to the kubelet starting to pull the image increases drastically.
What is the kubelet doing between the Pod being scheduled and the image pull starting, and why is that affected by the CPU limit? I would expect the CPU limit to apply only to the container itself, not to whatever happens before the pull.
The specific delay and the shape of the curve depend on the cluster, but the effect is present on every cluster I’ve tested.
kubectl describe pod shows the delay: Scheduled at 10s ago, Pulling at 4s ago, i.e. a 6s gap:
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---- ----               -------
  Normal  Scheduled  10s  default-scheduler  Successfully assigned jobtest/jobtest50mcpu--1-5jx68 to node-10-63-135-34
  Normal  Pulling    4s   kubelet            Pulling image "busybox:latest"
  Normal  Pulled     3s   kubelet            Successfully pulled image "busybox:latest" in 940.736382ms
  Normal  Created    3s   kubelet            Created container test
  Normal  Started    3s   kubelet            Started container test
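The 6s figure above comes from the rounded Age column; a small sketch for computing the gap from exact event timestamps instead. The timestamps below are made-up placeholders for illustration — on a real cluster they would come from the pod's Scheduled and Pulling events (e.g. via kubectl get events and the firstTimestamp field):

```shell
# On a real cluster these two timestamps would come from the pod's events,
# e.g.: kubectl -n jobtest get events --field-selector involvedObject.name=<pod>
# and reading firstTimestamp of the Scheduled and Pulling events.
# The values below are made-up placeholders for illustration.
scheduled="2022-06-01T10:00:00Z"
pulling="2022-06-01T10:00:06Z"

# GNU date: convert the timestamps to epoch seconds and subtract.
delay=$(( $(date -u -d "$pulling" +%s) - $(date -u -d "$scheduled" +%s) ))
echo "scheduled -> pulling delay: ${delay}s"
```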
job.yaml for testing:
apiVersion: batch/v1
kind: Job
metadata:
  name: jobtest50mcpu
spec:
  template:
    spec:
      containers:
      - name: test
        image: busybox:latest
        args:
        - "/bin/true"
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "50m"
      restartPolicy: Never
  backoffLimit: 0
---
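To reproduce the curve across different limits, a sketch that stamps out one Job per CPU limit. The limit values and naming scheme are my assumptions; the loop only prints the manifests here — on a real cluster, pipe the output to kubectl apply -f -:

```shell
# Generate one Job manifest per CPU limit. This only prints YAML;
# pipe the output to `kubectl apply -f -` to actually run the sweep.
# Limit values and the jobtest-<cpu>cpu naming are illustrative assumptions.
manifests=$(
for cpu in 50m 100m 250m 500m; do
  cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: jobtest-${cpu}cpu
spec:
  template:
    spec:
      containers:
      - name: test
        image: busybox:latest
        args: ["/bin/true"]
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "${cpu}"
      restartPolicy: Never
  backoffLimit: 0
---
EOF
done
)
printf '%s\n' "$manifests"
```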
Kubernetes version: Multiple (1.22, 1.23, 1.24)
Cloud being used: bare-metal
Installation method: Multiple (On-prem, Minikube)
Host OS: Linux 5.3.18-57 (SLES 15 SP3)
CRI and version: containerd 1.4.12