Node-Problem-Detector in Kubernetes Cluster

Hello everyone,
I’m trying to set up node-problem-detector (NPD) in my cluster so that it reports node status and pod status. I’m following this article: Monitor Node Health | Kubernetes. Since the article mentions creating a ConfigMap, I used the config folder from https://github.com/kubernetes/node-problem-detector.
However, I’m facing a few issues:

  1. There are no logs in any of the NPD pods. I tried deleting pods from the node and even restarting the node, but none of these events were logged.
  2. The article mentions using the Kubernetes exporter, but I do not have much information about how to set it up.

Cluster information:

Kubernetes version: 1.21
Cloud being used: public cloud (IBM Cloud)

npd.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-problem-detector
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: npd-binding
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node-problem-detector
subjects:
- kind: ServiceAccount
  name: node-problem-detector
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: npd-v0.8.9
  namespace: kube-system
  labels:
    k8s-app: node-problem-detector
    version: v0.8.9
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: node-problem-detector
      version: v0.8.9
  template:
    metadata:
      labels:
        k8s-app: node-problem-detector
        version: v0.8.9
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: node-problem-detector
        image: k8s.gcr.io/node-problem-detector/node-problem-detector:v0.8.9
        command:
        - "/bin/sh"
        - "-c"
        - "exec /node-problem-detector --logtostderr --config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json,/config/systemd-monitor.json --config.custom-plugin-monitor=/config/kernel-monitor-counter.json,/config/systemd-monitor-counter.json --config.system-stats-monitor=/config/system-stats-monitor.json >>/var/log/node-problem-detector.log 2>&1"
        securityContext:
          privileged: true
        resources:
          limits:
            cpu: "200m"
            memory: "100Mi"
          requests:
            cpu: "20m"
            memory: "20Mi"
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: log
          mountPath: /var/log
        - name: config
          mountPath: /config
          readOnly: true
        - name: localtime
          mountPath: /etc/localtime
          readOnly: true
      volumes:
      - name: log
        hostPath:
          path: /var/log/
      - name: config
        configMap:
          name: node-problem-detector-config
      - name: localtime
        hostPath:
          path: /etc/localtime
          type: "FileOrCreate"
      serviceAccountName: node-problem-detector
      tolerations:
        - operator: "Exists"
          effect: "NoExecute"
        - key: "CriticalAddonsOnly"
          operator: "Exists"
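
A note on issue 1, as far as I can tell from the manifest itself: the container command ends with `>>/var/log/node-problem-detector.log 2>&1`, and `/var/log` is a hostPath mount, so NPD's output would be appended to that file on each node rather than appearing in the pod logs. Below is a minimal sketch of the same container entry with only that redirect removed, on the assumption that the goal is to see the output with `kubectl logs`. Every flag is copied unchanged from npd.yaml; I have not verified this on the cluster.

```yaml
# Sketch only: same container command as in npd.yaml above, minus the trailing
# ">>/var/log/node-problem-detector.log 2>&1" redirect, so that output stays on
# the container's stderr (--logtostderr is already set in the original).
containers:
- name: node-problem-detector
  image: k8s.gcr.io/node-problem-detector/node-problem-detector:v0.8.9
  command:
  - "/bin/sh"
  - "-c"
  - "exec /node-problem-detector --logtostderr --config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json,/config/systemd-monitor.json --config.custom-plugin-monitor=/config/kernel-monitor-counter.json,/config/systemd-monitor-counter.json --config.system-stats-monitor=/config/system-stats-monitor.json"
```

With the redirect left in place, the same output should instead be readable in `/var/log/node-problem-detector.log` on the node.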