Cluster information:
Kubernetes version: 1.33.1
Cloud being used: bare metal
Installation method: Helmfile
Host OS: SUSE Linux
CNI and version: Calico 3.29.3
CRI and version: Containerd
Application: NoSQL database
Workload:
- Client generating 50% reads and 50% writes to the database
- Constant number of operations per second for 1 hour
- Both setups can run the same workload using slightly over 50% of the requested CPU
- CPU intensive
- Latency sensitive
Compared Setups:
Setup 1:
- 4 Database containers
- QoS Guaranteed
- 12 CPU cores requested
- 12 CPU cores limit
- CPU Manager static policy (CPU pinning)
Setup 2:
- 4 Database containers
- QoS Burstable
- 12 CPU cores requested
- 24 CPU cores limit
- Default CPU Manager policy, i.e. no CPU pinning (simplified manifests for both setups are sketched below)
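To make the comparison concrete, here is a minimal sketch of the two resource specs, reduced to a single container each. Names, image, and memory sizes are placeholders and only the CPU figures come from our setups; the real workloads are deployed through Helmfile with 4 database containers.

```yaml
# Setup 1 (sketch): Guaranteed QoS, requests == limits for every resource,
# and the node's kubelet runs with cpuManagerPolicy: static, so each container
# gets exclusive (pinned) cores.
apiVersion: v1
kind: Pod
metadata:
  name: nosql-db-guaranteed          # placeholder name
spec:
  containers:
    - name: nosql-db                 # placeholder name
      image: example/nosql-db:latest # placeholder image
      resources:
        requests:
          cpu: "12"
          memory: 32Gi               # memory size assumed; not stated above
        limits:
          cpu: "12"
          memory: 32Gi
---
# Setup 2 (sketch): Burstable QoS, CPU limit above the request, and the default
# ("none") CPU Manager policy, so threads are not pinned and can run on any
# core up to the 24-CPU CFS quota.
apiVersion: v1
kind: Pod
metadata:
  name: nosql-db-burstable           # placeholder name
spec:
  containers:
    - name: nosql-db
      image: example/nosql-db:latest
      resources:
        requests:
          cpu: "12"
          memory: 32Gi
        limits:
          cpu: "24"
          memory: 32Gi
```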
General Remarks:
Setup 2 performs better with the same workload:
- Setup 2 runs it with lower CPU usage (per container_cpu_usage_seconds_total; see the query sketch after this list)
- Setup 2 runs it with lower latencies (both in median/average and in 99pct/max)
- Garbage collection in the JVM that runs the database is statistically comparable between the two setups
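For reference, the per-pod CPU usage behind the first remark can be expressed as a Prometheus recording rule along these lines (a sketch; the group name, rule name, and label selector are placeholders, not our actual configuration):

```yaml
# Sketch of a Prometheus recording rule for the per-pod CPU rate we compared.
# Group name, rule name, and selectors are placeholders.
groups:
  - name: db-benchmark-cpu
    rules:
      - record: namespace_pod:container_cpu_usage_seconds:rate5m
        expr: |
          sum by (namespace, pod) (
            rate(container_cpu_usage_seconds_total{container!=""}[5m])
          )
```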
Questions:
Initially, we expected both setups to perform similarly under this manageable workload (we treated it as an experimental control).
In internal discussions, we found it strange and even counter-intuitive that, with usage staying under the requested CPU per container, the two setups differed so much, favoring Setup 2. Discussing it with a colleague, he argues this is the expected result; since we could not agree, I am bringing the question to you.
Is this expected behaviour, and why? We would expect these two settings to behave the same, in terms of CPU usage for a given TPS, in this roughly 6-CPU usage scenario.