Cluster running out of memory

Hi All,
I am facing a cluster issue. It's a private cluster: inside the organisation, people create their own namespaces but never delete them. Is there any automation that could delete a namespace after a configured number of hours?

I wrote this CronJob as an example. Set the TTL in seconds and it will delete namespaces that have been labeled with auto-delete=true.

You can add the label via:

kubectl label namespace/qwer --overwrite auto-delete=true
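To see which namespaces are currently marked for cleanup, or to exempt one again, the same label can be queried or removed (assuming you have kubectl access to the cluster; `qwer` is just the example namespace from above):

```shell
# List namespaces currently marked for auto-deletion
kubectl get ns -l auto-delete=true

# Exempt a namespace again by removing the label (note the trailing "-")
kubectl label namespace/qwer auto-delete-
```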

Here’s the CronJob:

# the job will be using a service account to run kubectl commands 
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: namespace-cleaner
automountServiceAccountToken: true

# These permissions let the job view and delete namespaces
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: namespace-cleaner
rules:
- apiGroups:
    - "" # namespaces live in the core API group
  resources:
    - namespaces
  verbs:
    - get
    - list
    - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: namespace-cleaner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: namespace-cleaner
subjects:
- kind: ServiceAccount
  name: namespace-cleaner
  namespace: default

# These permissions let the job read its own pod's `metadata.creationTimestamp`,
# which it uses as the "current time" reference for the comparison below
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: namespace-cleaner
rules:
- apiGroups:
    - "" # pods live in the core API group
  resources:
    - pods
  verbs:
    - get
    - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: namespace-cleaner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: namespace-cleaner
subjects:
- kind: ServiceAccount
  name: namespace-cleaner
  namespace: default

# Job to clean up old namespaces. When testing, it can be made to run
# once per minute by setting the schedule to "*/1 * * * *"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: namespace-cleaner
spec:
  schedule: "@hourly"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: namespace-cleaner
          automountServiceAccountToken: true
          containers:
          - name: namespace-cleaner
            image: bitnami/kubectl
            imagePullPolicy: IfNotPresent
            env:
            - name: K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: K8S_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: TTL
              value: "86400"
            command:
            - /bin/bash
            - -c
            - |
              set -e # fail on errors
              if [ "${TTL}" == "" ]; then
                echo TTL must be set
                exit 1
              fi

              if [ "${K8S_NAMESPACE}" == "" ]; then
                echo K8S_NAMESPACE not set
                exit 1
              fi

              if [ "${K8S_POD_NAME}" == "" ]; then
                echo K8S_POD_NAME not set
                exit 1
              fi

              CUR_DATESTAMP=$(kubectl -n "${K8S_NAMESPACE}" get pod "${K8S_POD_NAME}" -o go-template --template '{{.metadata.creationTimestamp}}{{"\n"}}')
              CUR_EPOCH=$(jq -n --arg datestamp "${CUR_DATESTAMP}" '$datestamp | fromdate')
              echo "CUR_DATESTAMP=${CUR_DATESTAMP}"

              kubectl get ns --selector="auto-delete=true" -o go-template \
                  --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
                  | while read -r NAMESPACE CREATED_DATESTAMP; do
                      CREATED_EPOCH=$(jq -n --arg datestamp "${CREATED_DATESTAMP}" '$datestamp | fromdate')
                      SECONDS_ELAPSED=$(( CUR_EPOCH - CREATED_EPOCH ))
                      echo "${NAMESPACE} has been running for ${SECONDS_ELAPSED} seconds"
                      if [ "${SECONDS_ELAPSED}" -gt "${TTL}" ]; then
                        echo "${NAMESPACE} needs to be deleted"
                        kubectl delete ns "${NAMESPACE}"
                      fi
                  done
          restartPolicy: Never
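To install it, something like this should work, assuming you've saved everything above into a single file named namespace-cleaner.yaml. Note the ServiceAccount has no namespace in its metadata, so it is created in whichever namespace you apply to, and the bindings expect it in default:

```shell
kubectl apply -n default -f namespace-cleaner.yaml
```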

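A side note on the date math: jq's fromdate builtin parses the RFC 3339 timestamps the API server returns into Unix epoch seconds, so the TTL check becomes a plain integer subtraction. Here's a standalone sketch with made-up timestamps, runnable without a cluster (assumes jq is installed):

```shell
#!/bin/bash
# Hypothetical timestamps standing in for the pod's and a namespace's
# metadata.creationTimestamp values.
NS_CREATED="2021-01-01T00:00:00Z"
POD_NOW="2021-01-02T01:00:00Z"

# fromdate converts an RFC 3339 string to Unix epoch seconds
CREATED_EPOCH=$(jq -n --arg d "${NS_CREATED}" '$d | fromdate')   # 1609459200
NOW_EPOCH=$(jq -n --arg d "${POD_NOW}" '$d | fromdate')

# 25 hours elapsed = 90000 seconds, which exceeds TTL=86400, so this
# namespace would be deleted
echo $(( NOW_EPOCH - CREATED_EPOCH ))  # 90000
```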
Here's the license for the bash script embedded in it:

MIT License

Copyright (c) 2021 protosam

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Edit: I had to fix the role permissions; I had forgotten to delete my docker-for-desktop-binding role, so my testing wasn't actually exercising the new rules. I also had to switch from /bin/sh to /bin/bash; that was just a mistake.

Edit #2: I really didn't like creating a namespace just to compare its creationTimestamp, so I added permissions so the pod can read its own creationTimestamp for the time comparison. I also discovered that you can't use metadata.creationTimestamp or status.startTime in env[].valueFrom.fieldRef.fieldPath.
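For reference, the downward API only exposes a limited set of pod fields through env[].valueFrom.fieldRef; as far as I know, the supported fieldPath values on the pod itself are roughly these (plus resourceFieldRef for CPU/memory requests and limits):

```yaml
env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name   # also: metadata.namespace, metadata.uid,
                                 # metadata.labels['<KEY>'],
                                 # metadata.annotations['<KEY>'],
                                 # spec.nodeName, spec.serviceAccountName,
                                 # status.hostIP, status.podIP, status.podIPs
```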