Hello, I have been searching around but have not found a good answer, so I am raising the question here in the hope of getting some professional help.
I have an accelerator application running on a bare-metal machine. It consists of one server and at least one client, which exchange information through Linux IPC (shared memory and message queues).
I am now trying to put all of them into one Pod managed by a Kubernetes Deployment (three Docker containers inside: one for the server and two for the clients; to make the IPC work, hostIPC: true is set in the YAML).
But when running the test I ran into two issues:
1. If I delete the Pod and the Deployment then creates a new one, some “garbage” is still left in the server container. For example, shared memory files in /dev/shm that were not deleted in the previous Pod still exist in the new Pod. This surprised me, as I thought that each time a Pod is created, its containers are created fresh from the Docker image. That does not seem to be the case here; does the new Pod somehow “inherit” the container from the previous Pod?
   Is there any way to prevent that (I mean, to have a really new container start up each time)?
2. After the application has been running for some time, its performance (mostly computing throughput) gradually decreases, and after one overnight test it drops to around 20% of the initial value. What could cause such performance degradation: memory, CPU, or storage? I am also not sure whether IPC is a bottleneck.
My Cluster information:
Kubernetes version: Client v1.17.1 + Server v1.19.4
Cloud being used: no, bare-metal machine
Installation method: apt-get install
Host OS: Ubuntu 18.04
CNI and version: Flannel v0.13.0
CRI and version: docker 19.03
Below is the YAML I am using:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-cloud-dp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-cloud-pod
  template:
    metadata:
      labels:
        app: test-cloud-pod
    spec:
      hostIPC: true
      dnsPolicy: ClusterFirst
      containers:
      - name: test-server
        image: test-server:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "./run.sh"]
      - name: test-client-a
        image: test-client:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "while true; do echo ==client ready==; sleep 20; done;"]
        ports:
        - name: client-port
          containerPort: 8010
      - name: test-client-b
        image: test-client:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "while true; do echo == client ready==; sleep 20; done;"]
        ports:
        - name: client-port
          containerPort: 8010
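Regarding the first issue, one variation that may be worth testing is sketched below. It is an untested sketch, not a confirmed fix. It rests on two points: with hostIPC: true the Pod uses the host's IPC namespace, so shared memory segments and message queues live on the host and can outlive the Pod; and containers in the same Pod already share an IPC namespace with each other. The sketch therefore drops hostIPC and mounts a memory-backed emptyDir at /dev/shm, which is removed together with the Pod. The volume name shm is hypothetical, and it is an assumption that the application does not need host-level IPC (for example to talk to a process or driver running outside the Pod).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-cloud-dp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-cloud-pod
  template:
    metadata:
      labels:
        app: test-cloud-pod
    spec:
      # hostIPC is intentionally omitted (assumption: Pod-level IPC is enough)
      dnsPolicy: ClusterFirst
      volumes:
      - name: shm              # hypothetical name
        emptyDir:
          medium: Memory       # tmpfs-backed; deleted when the Pod is deleted
      containers:
      - name: test-server
        image: test-server:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "./run.sh"]
        volumeMounts:
        - name: shm
          mountPath: /dev/shm
      - name: test-client-a
        image: test-client:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "while true; do echo ==client ready==; sleep 20; done;"]
        ports:
        - name: client-port
          containerPort: 8010
        volumeMounts:
        - name: shm
          mountPath: /dev/shm
      - name: test-client-b
        image: test-client:0.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c", "while true; do echo == client ready==; sleep 20; done;"]
        ports:
        - name: client-port
          containerPort: 8010
        volumeMounts:
        - name: shm
          mountPath: /dev/shm
```

If the leftover files no longer appear with this variant, that would suggest they were living in the host's /dev/shm rather than being carried over inside a reused container.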