Node memory usage incorrect (?)

cyrus-mc · June 28, 2022, 10:48pm

Cluster information:

Kubernetes version: 1.21
Cloud being used: AWS
Installation method:
Host OS: Bottlerocket
CRI and version: containerd 1.5.11

I am investigating memory usage (working set bytes) as reported through Prometheus.

To get the full working set bytes of all applications/processes/etc. on the system, I used a simple query:

container_memory_working_set_bytes{id="/"}

The value returned here matches kubectl top nodes pretty closely, so far so good. However, the value seemed awfully high for what I expected on the node (about 6G).

To dig into this further, I then calculated the working set bytes for just the pods:

sum(container_memory_working_set_bytes{pod!=""})

In my case this returned about 2.5G. The 3.5G difference between the total working set and the pod working set seemed high: it would mean that the OS components, kubelet, and runtime (containerd in my case) were using 3.5G of memory, almost a quarter of the total machine memory (16G instance).

I continued further down the rabbit hole and looked at the cgroup (v2) stats for containerd.slice. Its working set (calculated as memory.current minus inactive_file from memory.stat, as that is the calculation cadvisor/runc uses) was around 1.7G, the majority of it active file-backed memory. That explains about half of the 3.5G above (I am not sure why containerd is using that much memory, as it really shouldn't be) but leaves another 1.7G unaccounted for.
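For reference, the working set calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the cadvisor/runc code itself: the function names are hypothetical, the cgroup path in the usage comment is an assumption that varies by host, and the floor at zero is my reading of how a negative result should be handled.

```python
from pathlib import Path


def parse_memory_stat(text: str) -> dict:
    """Parse the 'key value' lines of a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(memory_current: int, memory_stat: dict) -> int:
    """Usage minus inactive file-backed memory, floored at zero (assumption)."""
    inactive_file = memory_stat.get("inactive_file", 0)
    return max(memory_current - inactive_file, 0)


def cgroup_working_set(cgroup_dir: str) -> int:
    """Read memory.current and memory.stat from one cgroup v2 directory."""
    base = Path(cgroup_dir)
    current = int((base / "memory.current").read_text().strip())
    stat = parse_memory_stat((base / "memory.stat").read_text())
    return working_set_bytes(current, stat)


# Usage on a node (path is an assumption; adjust to the host's cgroup layout):
#   cgroup_working_set("/sys/fs/cgroup/containerd.slice")
```

Comparing active_file and anon in the parsed memory.stat shows how much of the working set is file-backed versus anonymous memory.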

Sticking to cgroup-exported stats, I calculated the working set bytes for the root cgroup (usage derived from /proc/meminfo minus inactive_file from memory.stat) and found it to be in line with what I was seeing above (roughly 6G). So that explains where Kubernetes is getting that value. However, when I calculated the working sets of the child cgroups/slices to see where all the memory was being used, I still ended up with around 1.7G unaccounted for.
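The comparison between the root working set and its child cgroups can be sketched as follows. This is only an illustration of the bookkeeping: the helper names are hypothetical, the root working set is passed in as a number because the root cgroup is computed differently (from /proc/meminfo), and the code assumes every child with a memory.current file also has a memory.stat file.

```python
from pathlib import Path


def read_working_set(cgroup_dir: Path) -> int:
    """memory.current minus inactive_file for one cgroup v2 directory."""
    current = int((cgroup_dir / "memory.current").read_text().strip())
    inactive_file = 0
    for line in (cgroup_dir / "memory.stat").read_text().splitlines():
        key, _, value = line.partition(" ")
        if key == "inactive_file":
            inactive_file = int(value)
    return max(current - inactive_file, 0)


def child_working_sets(root: Path) -> dict:
    """Working set of each immediate child cgroup that exposes memory.current."""
    result = {}
    for child in sorted(root.iterdir()):
        if child.is_dir() and (child / "memory.current").exists():
            result[child.name] = read_working_set(child)
    return result


def unaccounted_bytes(root_working_set: int, children: dict) -> int:
    """Root working set minus the sum of the child working sets."""
    return root_working_set - sum(children.values())


# Usage on a node (mount point is an assumption):
#   kids = child_working_sets(Path("/sys/fs/cgroup"))
#   gap = unaccounted_bytes(root_working_set, kids)
```

A large positive gap means memory is counted at the root that no child slice reports.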

Therefore I am left with the question as to where this memory is being consumed. It is quite important to understand, as it impacts Kubernetes and when OOM/eviction thresholds are met.

Alexandru_Lazarev · December 13, 2024, 12:11pm

So maybe that 1.7G is used by the system (the kernel and related objects such as slabs, etc.) plus some system and non-containerized applications, i.e. processes running directly on the host?
I have a similar question and am trying to understand it.
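One way to test the kernel-memory guess is to look at the kernel-side fields in /proc/meminfo. The sketch below is a rough check, not a definitive accounting: the choice of Slab, KernelStack, and PageTables as "kernel-side" fields is an assumption, the function names are hypothetical, and some of this memory may still be charged to child cgroups.

```python
def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo field names to values, scaling kB values to bytes."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


# Assumption: these fields approximate memory used by the kernel itself.
KERNEL_FIELDS = ("Slab", "KernelStack", "PageTables")


def kernel_side_bytes(meminfo: dict) -> int:
    """Sum the selected kernel-side fields, treating missing ones as zero."""
    return sum(meminfo.get(name, 0) for name in KERNEL_FIELDS)


# Usage on a node:
#   with open("/proc/meminfo") as f:
#       print(kernel_side_bytes(parse_meminfo(f.read())))
```

If the result is close to the unaccounted figure, the kernel is a likely consumer; if it is much smaller, processes running outside the measured slices are worth checking next.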
