About PreStop hook for DaemonSet

Hi.
Let me propose a new feature for the Kubernetes DaemonSet resource.

As far as I can tell, Kubernetes DaemonSets do not currently support the PreStop lifecycle hook. I understand that a DaemonSet is designed to run on every node, but this becomes a problem when we want a DaemonSet to do some work before a node is scaled in.

For example, suppose a DaemonSet runs batch jobs on behalf of the pods on the same node. The cluster scales out via cluster-autoscaler and, some minutes later, scales back in. In this case we want to guarantee that the DaemonSet's last job finishes successfully, and to achieve that we want a PreStop hook to run. Unfortunately, this is not easy in the current Kubernetes version: the scale-in process kills DaemonSet pods instantly without running the PreStop hook, even though ReplicaSet pods are terminated gracefully.
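
To make the request concrete, here is a minimal sketch of the kind of spec I would expect to be honored during node scale-in (the name, image, and wait script are hypothetical):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-batch-worker                    # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-batch-worker
  template:
    metadata:
      labels:
        app: node-batch-worker
    spec:
      # Give the pod time to finish its last job before the node goes away
      terminationGracePeriodSeconds: 120
      containers:
      - name: worker
        image: example.com/node-batch-worker:latest   # hypothetical image
        lifecycle:
          preStop:
            exec:
              # Hypothetical script that blocks until the current batch job is done
              command: ["/bin/sh", "-c", "/scripts/wait-for-current-job.sh"]
```

If the PreStop hook ran and the grace period were respected during node scale-in, the guarantee described above would hold.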

I understand that a DaemonSet is guaranteed to run on each node, so running a termination process for a DaemonSet before stopping the node is, by definition, something of a contradiction.

Nevertheless, I believe DaemonSet pods should have their PreStop hooks run when a node is being stopped.

I would like to hear your opinions.

This is pretty critical for a use case we have. The DaemonSet needs to deregister the node's public IP from DNS and allow some time for (most) DNS caches to expire. Without a PreStop hook, or honoring of terminationGracePeriodSeconds (which also doesn't happen), we face obvious negative consequences.
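
For reference, a minimal sketch of the relevant fragment of the pod template we would want to run, where the deregistration script and the sleep duration are placeholders:

```yaml
# Fragment of a DaemonSet pod template (script and timings are placeholders)
spec:
  terminationGracePeriodSeconds: 90
  containers:
  - name: dns-agent                          # hypothetical container name
    image: example.com/dns-agent:latest      # hypothetical image
    lifecycle:
      preStop:
        exec:
          # Remove this node's public IP from DNS, then wait for caches to expire
          command: ["/bin/sh", "-c", "/scripts/deregister-dns.sh && sleep 60"]
```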

In my testing, I believe it works for:

- Deployments when the node is shutting down.
- DaemonSets when a taint is used to kick the DaemonSet pod off a node.

It's just so close to working it's frustrating :(

I will admit that I’m not super familiar with the code in this area, but what you’re describing doesn’t sound right. Kubelet doesn’t really know the difference between a Deployment pod and a DaemonSet pod. They are both just pods, and both should be eligible for the full feature set.

Without an exact repro case, it's hard to say what might be going wrong. When I have seen problem reports in this area in the past, it has usually been that the container itself is exiting early, e.g. on SIGTERM.
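
For illustration, here is a minimal sketch of a container that traps SIGTERM and drains before exiting instead of dying immediately (the image, timings, and drain step are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sigterm-demo                 # hypothetical name
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: worker
    image: busybox:1.36
    command:
    - /bin/sh
    - -c
    - |
      # On SIGTERM, finish in-flight work instead of exiting right away
      finish() { echo "draining..."; sleep 5; exit 0; }
      trap finish TERM
      while true; do sleep 1; done
```

A container that exits the instant it receives SIGTERM will disappear long before terminationGracePeriodSeconds elapses, which can look like the grace period is being ignored.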

If you have a simple, but complete, reproduction, it might be worth opening an issue against the GitHub repo. Discuss is not a place to file bugs.