Unknown high memory usage in pod - "Ghost in the Shell"

Hi there,

I’d appreciate any help getting to the bottom of an issue we are seeing
with high memory usage in a pod.

We have a client running our product in a Kubernetes cluster. They are reporting
high memory consumption from our product, well above what it would normally consume.
From all our investigations, our product is not using more than its configured limit
of 2.5 GB, but the pod shows memory usage of 8 to 9 GB. Every few days they restart
the pod to avoid out-of-memory crashes. They have since raised the pod's memory
request to 4Gi and its limit to 8Gi.
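Concretely, the container's resources block now looks roughly like this (paraphrased from their manifest, not an exact copy):

```yaml
# Illustrative fragment only, reconstructed from the values above.
resources:
  requests:
    memory: "4Gi"
  limits:
    memory: "8Gi"
```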

They notice a spike of 3-4 GB of memory every day around the same time. The memory
then drops back, but never all the way to where it had been, so there is a net
increase in memory usage over time. Is it common in Kubernetes for memory not to
be released?
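One thing we are trying to understand is how the reported number is computed. As far as we can tell, the working set that `kubectl top` and the kubelet use is the cgroup's memory usage minus its inactive file cache, so page cache can make a pod's usage look high without being a leak. A rough sketch of that arithmetic, with made-up example values (not from our pod):

```shell
# Working set as the kubelet computes it: usage minus inactive file cache.
# The values below are illustrative only, NOT measurements from our pod.
usage=$((9 * 1024 * 1024 * 1024))          # memory.usage_in_bytes (9 GiB)
inactive_file=$((6 * 1024 * 1024 * 1024))  # total_inactive_file from memory.stat (6 GiB)
working_set=$((usage - inactive_file))
echo "working set: $((working_set / 1024 / 1024 / 1024)) GiB"  # prints "working set: 3 GiB"
```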

Here are the stats from the pod:

Thanks


[root@pod-123 /]# cat /proc/meminfo
MemTotal: 32939968 kB
MemFree: 2807660 kB
MemAvailable: 21599668 kB
Buffers: 549132 kB
Cached: 18518680 kB
SwapCached: 311768 kB
Active: 21082336 kB
Inactive: 6960584 kB
Active(anon): 8787860 kB
Inactive(anon): 1644464 kB
Active(file): 12294476 kB
Inactive(file): 5316120 kB
Unevictable: 85880 kB
Mlocked: 85880 kB
SwapTotal: 32939004 kB
SwapFree: 31856892 kB
Dirty: 6732 kB
Writeback: 0 kB
AnonPages: 8993012 kB
Mapped: 516888 kB
Shmem: 1707032 kB
Slab: 1787564 kB
SReclaimable: 1637752 kB
SUnreclaim: 149812 kB
KernelStack: 23120 kB
PageTables: 38916 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 49408988 kB
Committed_AS: 24505240 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 12288 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 188288 kB
DirectMap2M: 12394496 kB
DirectMap1G: 23068672 kB


[root@pod-123 /]# free -m
              total        used        free      shared  buff/cache   available
Mem:          32167        9205        2736        1667       20225       21093
Swap:         32166        1056       31110


[root@pod-123 /]# vmstat -s
32939968 K total memory
9426368 K used memory
21081792 K active memory
6970424 K inactive memory
2798132 K free memory
549132 K buffer memory
20166336 K swap cache
32939004 K total swap
1082112 K used swap
31856892 K free swap
473083354 non-nice user cpu ticks
32794 nice user cpu ticks
211150452 system cpu ticks
5450063383 idle cpu ticks
76525885 IO-wait cpu ticks
0 IRQ cpu ticks
39537536 softirq cpu ticks
0 stolen cpu ticks
987975329 pages paged in
1734773564 pages paged out
10539909 pages swapped in
591792 pages swapped out
3497313353 interrupts
1139477112 CPU context switches
1649005809 boot time
68711829 forks


[root@pod-123 /]# top -n 1
top - 15:01:44 up 91 days, 13:51, 0 users, load average: 2.68, 2.90, 3.14
Tasks: 4 total, 1 running, 3 sleeping, 0 stopped, 0 zombie
%Cpu(s): 17.5 us, 6.7 sy, 0.0 ni, 74.2 id, 0.8 wa, 0.0 hi, 0.8 si, 0.0 st
KiB Mem : 32939968 total, 2801332 free, 9425780 used, 20712856 buff/cache
KiB Swap: 32939004 total, 31856892 free, 1082112 used. 21600456 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 abc       10 -10 8541288   2.5g  40580 S   0.0  8.0   1557:39 pod-process
24931 abc       10 -10   11836   2640   2268 S   0.0  0.0   0:00.02 sh
31171 root      10 -10   11840   2924   2492 S   0.0  0.0   0:00.02 bash
31207 root      10 -10   56196   3716   3176 R   0.0  0.0   0:00.00 top
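Worth noting: the MemTotal above is ~32 GB, far above the 8Gi pod limit, so these counters appear to describe the node rather than the container. The next time it spikes we plan to also capture the cgroup's own counters from inside the container, which should separate our process's RSS from page cache. Something like this (assuming cgroup v1; on cgroup v2 the files are /sys/fs/cgroup/memory.current and memory.stat instead):

```shell
#!/bin/sh
# Dump the container's cgroup v1 memory accounting, if the controller is mounted.
CG=/sys/fs/cgroup/memory
if [ -r "$CG/memory.usage_in_bytes" ]; then
  echo "usage_bytes=$(cat "$CG/memory.usage_in_bytes")"
  # total_rss is anonymous memory; total_cache and total_inactive_file are page cache
  grep -E '^total_(rss|cache|inactive_file) ' "$CG/memory.stat"
else
  echo "cgroup v1 memory controller not found at $CG"
fi
```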


Kubernetes version:
Cloud being used: (put bare-metal if not on a public cloud)
Installation method:
Host OS:
CNI and version:
CRI and version:

You could try turning off that function on the node running your pod. It may work.