CPU usage between "request" and "limit"

If the CPU request is 500 and the CPU limit is 4000, what happens when the pod (there is only one container in the pod) tries to use more than 500 CPU?

By “what happens”, I mean to ask whether there is some throttling, etc. involved.

Let's say the node has ample free CPU. Then, will there be any difference in performance between the following:
(a) request=500 & limit=4000. Actual cpu usage 2000.
(b) request=3000 & limit=4000. Actual cpu usage 2000.

Basically, the moment actual CPU usage crosses 500, it will be allowed to use more CPU (up to 4000), but will that be allowed in a step-wise fashion? Will the scheduler play any role?

Thanks
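For concreteness, scenario (a) would look roughly like this as a pod spec, assuming the numbers above are millicores (500 = 500m, 4000 = 4000m); the pod name and image below are just placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scenario-a            # placeholder name
spec:
  containers:
  - name: app                 # the single container in the pod
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: 500m             # what the scheduler reserves on the node
      limits:
        cpu: 4000m            # hard ceiling enforced by the kernel
```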

If there is no contention, then there should be no impact.


I think the limit sets /sys/fs/cgroup/cpu/cpu.cfs_quota_us inside the container, and this is a hard limit that will not be exceeded.

The request sets /sys/fs/cgroup/cpu/cpu.shares. As long as there is idle CPU on the host it will not make a difference, but when the host running those containers hits 100% CPU usage, the pod with a request of 500 will get six times less CPU than the pod with a request of 3000.
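To put rough numbers on that (a sketch assuming cgroup v1 and the standard Kubernetes millicore conversions; exact files and values vary with the runtime and cgroup version):

```yaml
# Container resources fragment for the "request 500" pod
resources:
  requests:
    cpu: 500m     # -> cpu.shares ≈ 512   (500 * 1024 / 1000)
  limits:
    cpu: 4000m    # -> cpu.cfs_quota_us = 400000 with cpu.cfs_period_us = 100000,
                  #    i.e. at most 400ms of CPU time per 100ms period (4 cores)
---
# Container resources fragment for the "request 3000" pod
resources:
  requests:
    cpu: 3000m    # -> cpu.shares ≈ 3072  (3000 * 1024 / 1000)
  limits:
    cpu: 4000m    # -> same quota and period as above
```

With idle CPU on the host, both can run all the way up to their 4000m quota; only when the host is saturated does the 512 : 3072 ≈ 1 : 6 shares ratio decide how CPU time is divided.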

Thanks @thockin @Adam_Dembek

@thockin Can you please help clarify one thing. If I set both CPU requests and limits, the role of requests is only for scheduling, right? Since we have set the limits, it will follow cpu.cfs_period_us and cpu.cfs_quota_us?
Overall, are CPU requests of no use when we set limits, other than being helpful for scheduling?

Correct, the default is that the request is used for scheduling and the limit is used for enforcement.
Minor point: if they are not the same, you are more likely to be evicted compared to others (look up Kubernetes QoS).
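As a sketch of that QoS point (memory values here are made up just to complete the example): requests equal to limits for every container and resource put the pod in the Guaranteed class, while anything less makes it Burstable, which sits lower in the eviction order under node pressure.

```yaml
# Burstable: requests < limits
resources:
  requests:
    cpu: 500m
    memory: 256Mi
  limits:
    cpu: 4000m
    memory: 1Gi
---
# Guaranteed: requests == limits for every resource
resources:
  requests:
    cpu: 2000m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 1Gi
```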

Also, it is possible to disable enforcement of limits (it's a kubelet setting, IIRC). In these cases requests are what matter and CPU shares are set. This allows bursting within constraints. When contended, you will get the ratio of your request / total requests on the node. We personally prefer this, because the default limit algorithm (I think this was improved in recent kernels) is not very flexible, and for thread-based languages like Java and C# it caused a lot of grief, especially during startup. YMMV
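If it helps, I believe the kubelet setting in question is the CFS quota toggle; a minimal KubeletConfiguration sketch (field name per the kubelet.config.k8s.io/v1beta1 API, so check your kubelet version):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Stop translating CPU limits into cpu.cfs_quota_us, so containers may burst
# above their limit; requests (cpu.shares) still decide the split under contention.
cpuCFSQuota: false
```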

@Michael_Bell @itnazeer
Responses are here:

  • The CPU limit defines a hard ceiling on how much CPU time the container can use. During each scheduling interval (time slice), the Linux kernel checks to see if this limit is exceeded; if so, the kernel waits before allowing that cgroup to resume execution.
  • The CPU request typically defines a weighting. If several different containers (cgroups) want to run on a contended system, workloads with larger CPU requests are allocated more CPU time than workloads with small requests.
    Resource Management for Pods and Containers | Kubernetes

and here:

A container might or might not be allowed to exceed its CPU limit for extended periods of time. However, container runtimes don’t terminate Pods or containers for excessive CPU usage.
src: Resource Management for Pods and Containers | Kubernetes

So a node under CPU contention will not evict the culprit (high-CPU-consuming) pods and containers.

Not JUST scheduling - the OS uses requests for CPU scheduling when there is contention. A pod which requests cpu: 1 will get half as much CPU time as a pod with cpu: 2, and those should be approximately 1 CPU-second/second and 2 CPU-seconds/second, respectively.

Absent any contention, CPU can go as high as limits.