I have a 9-node cluster (3 etcd, 2 masters, 4 compute). The compute nodes are 16-core, 32 GB RAM, running RHEL 7.5 (kernel 3.10), and they are mostly idle (at the time of this writing, the total reported CPU usage is close to 6 cores and memory is 105Gi).
The application is a .NET Core 2.0 web API. The image is based on Debian 10. The web API is very simple: it basically only queries a database and another internal WCF service (not hosted on Kubernetes) for information. The database queries are very simple, don't even involve a "join", and are all index-optimised.
The pod deployment requests 100m of CPU and 100Mi of memory, with limits of 1 CPU and 1Gi.
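For reference, the resource section of the deployment looks roughly like this (container and image names below are placeholders, not our real ones):

```yaml
# Sketch of the container resources in our deployment spec
# (container/image names are placeholders)
containers:
  - name: webapi
    image: our-registry/webapi:latest
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: "1"
        memory: 1Gi
```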
The scenario I need help with is:
- This morning we were peaking around 8000 requests per min (133 per sec).
- We had 10 pods running the app, spread as evenly as possible.
- The pods were reporting CPU usage around 10-25% and 300-450Mi of memory.
- The response time was absolute garbage; we're talking 50-second response times.
The database guys said that everything was very smooth on their side, and our monitoring of the WCF service showed that it was also doing fine (responding in around 50 to 300ms). Basically, our requests were stuck inside the pods, not waiting on some kind of network I/O.
There were no indications of CPU or memory pressure, but we decided to double the number of pods to 20, and the problem went away.
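To be clear, the scale-up was nothing more than bumping the replica count on the deployment, something like this (deployment name is a placeholder):

```yaml
# The only change made to the deployment spec (name is a placeholder)
spec:
  replicas: 20   # previously 10
```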
Can anyone explain this behaviour?
Cluster information:
Kubernetes version: 1.8.3
Cloud being used: DXC - but IaaS
Installation method: don’t know
Host OS: RHEL 7.5 (3.10.0)
CNI and version:
CRI and version: