Cluster information:
Kubernetes version: v1.27.13
Cloud being used: AWS
Installation method: kOps
Host OS: Ubuntu 22.04.2 LTS
CNI and version: Calico v3.25.2
CRI and version: containerd v1.7.15
After enabling kubelet soft eviction on memory.available, the evicted pods terminate with reason ContainerStatusUnknown.
My kubelet configuration:
--system-reserved=cpu=200m,ephemeral-storage=1Gi,memory=300Mi
--kube-reserved=cpu=200m,ephemeral-storage=1Gi,memory=300Mi
--eviction-soft-grace-period=memory.available=0s
--eviction-soft=memory.available<7%
--eviction-max-pod-grace-period=30
--eviction-hard=memory.available<2%,nodefs.available<10%,nodefs.inodesFree<5%,imagefs.available<10%,imagefs.inodesFree<5%
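For completeness, here is how I understand these flags would map onto a kOps cluster-spec kubelet stanza. This is a sketch based on the field names in the kOps KubeletConfigSpec API; I have not verified it against this exact kOps version:

```yaml
# Sketch only: assumed kOps cluster spec equivalent of the flags above
# (spec.kubelet field names taken from the kOps KubeletConfigSpec API).
spec:
  kubelet:
    systemReserved:
      cpu: 200m
      ephemeral-storage: 1Gi
      memory: 300Mi
    kubeReserved:
      cpu: 200m
      ephemeral-storage: 1Gi
      memory: 300Mi
    evictionSoft: memory.available<7%
    evictionSoftGracePeriod: memory.available=0s
    evictionMaxPodGracePeriod: 30
    evictionHard: memory.available<2%,nodefs.available<10%,nodefs.inodesFree<5%,imagefs.available<10%,imagefs.inodesFree<5%
```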
Pod message:
Last State: Terminated
Reason: ContainerStatusUnknown
Message: The container could not be located when the pod was deleted. The container used to be Running
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Mon, 01 Jan 0001 00:00:00 +0000
kubelet logs (collected with `journalctl | grep eviction_manager.go`):
Jun 14 00:42:50 ip-10-53-122-22 kubelet[4670]: I0614 00:42:50.830185 4670 eviction_manager.go:346] "Eviction manager: attempting to reclaim" resourceName="memory"
Jun 14 00:42:50 ip-10-53-122-22 kubelet[4670]: I0614 00:42:50.830223 4670 eviction_manager.go:357] "Eviction manager: must evict pod(s) to reclaim" resourceName="memory"
Jun 14 00:43:05 ip-10-53-122-22 kubelet[4670]: I0614 00:43:05.248884 4670 eviction_manager.go:375] "Eviction manager: pods ranked for eviction" pods=[velero/memory-test-5b55b8d6c7-mpfl5 kube-system/kube-proxy-i-0bca6c04994e8a239 groceries-order/groceries-edit-manager-approve-timeout-854bdddcf-6bmjm kube-system/calico-node-f97xk kube-system/ebs-csi-node-vwqrd kube-system/node-local-dns-w8rvv machinery/k8s-node-labeler-f64tl groceries-retail/picking-notifier-api-778678d94d-n7fx2 groceries-retail/picking-notifier-consumer-749c54f859-ctqzn shop-app-bff/shop-app-bff-78546c8479-v44jd security/sec-groceries-api-f9f9c86d5-ppnpt groceries-catalog/api-catalog-normalization-data-enricher-5f9d87bdf7-zhczz groceries-catalog/service-catalog-groceries-sync-back-8c699d994-wjgs8 groceries-catalog/worker-catalog-product-normalize-sync-back-66fb9d8579-48mfv groceries-catalog-product-association/product-suggestion-association-worker-57987586b5-vbr5p security/poc-pipeline-6c7f48c87-k5ghm groceries-catalog-product-ingestion/catalog-product-change-price-lp-sync-worker-6bddbdf7b7-gmg2p groceries-catalog/worker-rupture-probability-enricher-priority-5bc4f9dc5b-2hrnn groceries-catalog-product-ingestion/catalog-dispatcher-stock-changed-for-mktplace-worker-676b5c659t groceries-order/groceries-order-search-api-6644fb5945-vzvsx ecommerce/bot-assistant-csat-worker-7bccf8f5bc-c6h62 catalog-min-stock-ingestor/catalog-min-stock-ingestor-7dbd65cbb6-8wwk6 monitoring/prometheus-operator-prometheus-node-exporter-kxlxk machinery/node-problem-detector-hcjxz machinery/consul-consul-ngfcr machinery/datadog-p9cb5 machinery/kiam-agent-c2zzv]
Jun 14 00:43:05 ip-10-53-122-22 kubelet[4670]: I0614 00:43:05.249208 4670 kuberuntime_container.go:742] "Killing container with a grace period" pod="velero/memory-test-5b55b8d6c7-mpfl5" podUID=d34cd5a8-0dd2-4f66-919f-42892e282269 containerName="memory-test" containerID="containerd://fb18555d8bead1999cb7261f0b337d4ebc54a40b816a8768bcaa11b018ef09d0" gracePeriod=60
Jun 14 00:43:07 ip-10-53-122-22 kubelet[4670]: I0614 00:43:07.325937 4670 eviction_manager.go:596] "Eviction manager: pod is evicted successfully" pod="velero/memory-test-5b55b8d6c7-mpfl5"
Jun 14 00:43:07 ip-10-53-122-22 kubelet[4670]: I0614 00:43:07.325966 4670 eviction_manager.go:205] "Eviction manager: pods evicted, waiting for pod to be cleaned up" pods=[velero/memory-test-5b55b8d6c7-mpfl5]
Jun 14 00:43:08 ip-10-53-122-22 kubelet[4670]: I0614 00:43:08.326520 4670 eviction_manager.go:427] "Eviction manager: pods successfully cleaned up" pods=[velero/memory-test-5b55b8d6c7-mpfl5]
The cluster runs m5ad.2xlarge instances (32 GiB memory).
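For context, here is a quick back-of-the-envelope calculation of what those percentage thresholds come out to on this node size. It assumes percentage thresholds are resolved against the node's total 32 GiB capacity:

```python
# Sketch: absolute memory.available eviction thresholds on a 32 GiB
# node (m5ad.2xlarge), assuming the kubelet resolves percentage
# thresholds against total node capacity.
GIB = 1024 ** 3
capacity_bytes = 32 * GIB

soft_threshold = 0.07 * capacity_bytes   # --eviction-soft=memory.available<7%
hard_threshold = 0.02 * capacity_bytes   # --eviction-hard=memory.available<2%

print(f"soft eviction fires below {soft_threshold / GIB:.2f} GiB free")  # 2.24 GiB
print(f"hard eviction fires below {hard_threshold / GIB:.2f} GiB free")  # 0.64 GiB
```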
Can anyone help me?