Pod memory overflow

Cluster information:

Kubernetes version: v1.26.5
Host OS: centos8
CNI and version: cni-plugins-linux-arm64-v1.3.0
containerd version: containerd v1.7.1 (1677a17964311325ed1c31e2c0a3589ce6d5c30d)

Question:

I allocated 64Gi of memory to a Deployment and started an algorithm process in it. After a period of time (about an hour), I found that the process was using a very large amount of memory (roughly 400Gi), which caused the host to crash with an OOM.

While the Pod is running, neither ps aux nor top shows the process's memory usage going up. However, free -h shows memory usage increasing until it exceeds the Pod memory limit and the system crashes due to OOM.
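One way to see what the kernel is actually charging to the Pod, independent of per-process RSS, is to read the container's cgroup memory accounting directly. This is only a sketch, assuming cgroup v1 paths and placeholder Pod/container names:

    # Total memory (anonymous + page cache) charged to the container's cgroup
    kubectl exec <pod-name> -c <container-name> -- \
      cat /sys/fs/cgroup/memory/memory.usage_in_bytes

    # Breakdown of rss vs. cache vs. mapped_file
    kubectl exec <pod-name> -c <container-name> -- \
      cat /sys/fs/cgroup/memory/memory.stat

    # On cgroup v2 the equivalent files are memory.current and memory.stat
    # directly under /sys/fs/cgroup/

If memory.stat shows the growth under cache or mapped_file rather than rss, the memory is page cache or file-backed/shared memory rather than the Python process's anonymous memory, which would explain why top and ps do not show it.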

The configuration file is as follows:

        command: ["sh", "-c"]
        args:
          - cd /home/** && python main.py --mode stream --videostream-path data/test.mp4 --weights data/model/yolov5l_310P3.om
        resources:
          limits:
            cpu: "16"
            memory: 64Gi
          requests:
            cpu: "8"
            memory: 32Gi
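As a side note, it may be worth confirming that the 64Gi limit really reached the container's cgroup, because a container that exceeds an applied limit should be OOM-killed by the kernel inside its own cgroup (the Pod then shows OOMKilled) instead of dragging the whole host into OOM. A sketch, again assuming cgroup v1 and placeholder names:

    # Should print 68719476736 (64Gi) if the limit was applied to the cgroup
    kubectl exec <pod-name> -c <container-name> -- \
      cat /sys/fs/cgroup/memory/memory.limit_in_bytes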

Since I can only upload one picture, I stitched several screenshots together into a single image.

Hi @rlws1600510083,
I'm just curious whether you found the root cause of your issue.

I don't understand how the process can consume 400GB when you set a 64GB limit.

Can you see your process's memory (RES) usage with the top command on the host?
Is your app actively working with files inside the container?
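If the thought behind that last question is that page cache or tmpfs from reading the video data is what free -h is counting, a quick way to check is to compare the host's buff/cache and shared columns with the container's tmpfs usage. The mount point below is an assumption; adjust it to wherever the app writes:

    # On the host: how much of the used memory is shared memory / page cache
    free -h

    # Inside the container: is a tmpfs such as /dev/shm filling up?
    kubectl exec <pod-name> -c <container-name> -- df -h /dev/shm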