K8s DaemonSet pod is in Running state even when the respective node is down

We are observing an issue where a DaemonSet pod stays in the Running state even when its node is down. I would have expected the pod on the down node to be evicted. However, since we expose the DaemonSet via a headless service, a DNS query to the headless service returns the IPs of all the DaemonSet pods (including the one on the down node). As a result, the application pod keeps trying to connect to the DaemonSet pod on the down node and traffic is interrupted.

Cluster information:

Kubernetes version: 1.21.1
Cloud being used: bare-metal
Installation method: K3s
Host OS: Ubuntu 18.04
CNI and version:
CRI and version:

How long did you wait after the node was down? The node-failure timeouts are on the order of 5 minutes by default, so things take some time to correct themselves when a node abruptly dies.
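If you want to see where the node is in that cycle, checking its status and taints usually tells the story (the node name is a placeholder):

kubectl get nodes
kubectl describe node <node-name>

Once the heartbeats stop you should see the node go NotReady and pick up the node.kubernetes.io/unreachable taint.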

Do you have health checks configured for the DaemonSet? I've not tested this, but my assumption is that a failing health check on the pod could cause it to be handled faster than just waiting for the node-failure machinery to kick in.
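For reference, a minimal readiness probe on the daemon container could look something like the following; the /health path is an assumption, and the 9191 port is the one that shows up later in this thread:

readinessProbe:
  httpGet:
    path: /health        # hypothetical health endpoint, use whatever the daemon exposes
    port: 9191
  periodSeconds: 5
  failureThreshold: 3

A pod that fails its readiness probe gets moved out of the ready addresses of the Services that select it.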

Yes, we waited long enough and do have health checks enabled. In fact, we left the node down to observe the behavior, and all the DaemonSet pods remain in the Running state.

The main concern is that the headless service returns the IP address of a pod that no longer exists. We have the same daemon exposed via both a ClusterIP service and a headless service. The ClusterIP service works as expected and doesn't forward traffic to the dead pod, but a DNS query to the headless service still returns that pod's IP address, which throws the application's load-balancing logic off. Any thoughts/suggestions on how to kick the DaemonSet pod out when its node is not ready?
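For context, the two Services look roughly like this; the ClusterIP service name and the selector are assumptions, while the headless name, namespace, and port are the ones shown further down in this thread:

apiVersion: v1
kind: Service
metadata:
  name: opa              # assumed name; plain ClusterIP service
  namespace: ztna
spec:
  selector:
    service: opa         # assumed selector
  ports:
  - port: 9191
    targetPort: 9191
---
apiVersion: v1
kind: Service
metadata:
  name: opa-headless     # headless: clients resolve pod IPs directly from DNS
  namespace: ztna
spec:
  clusterIP: None
  selector:
    service: opa         # assumed selector
  ports:
  - port: 9191
    targetPort: 9191

With the headless variant there is no virtual IP and no kube-proxy in the path, so whatever the DNS answer contains is what the application load-balances over.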

Perhaps with tolerations?

I came across this in a Stack Overflow post.

Read the documentation carefully though; it seems DaemonSets have some tolerations set by default.

Thanks. Yes, the DaemonSet pods are not evicted because of the tolerations added by default.
We could try fixing it by removing those tolerations, but that would require monitoring the DaemonSet pod deployments to automate it.
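For reference, the DaemonSet controller automatically adds tolerations along these lines to its pods; since they carry no tolerationSeconds, the pods tolerate the NoExecute taints on a dead node indefinitely and are never evicted:

tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute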

Do you think a k8s headless service returning an endpoint that is not reachable is an issue that should be fixed in k8s?

Looking at the headless services docs, they just kind of say that DNS happens, not really why it happens.

When I check out the ServiceSpec API Reference, the setting “publishNotReadyAddresses” strikes me as interesting.
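In the Service spec it sits next to clusterIP, roughly like this for the headless service from this thread:

apiVersion: v1
kind: Service
metadata:
  name: opa-headless
  namespace: ztna
spec:
  clusterIP: None
  publishNotReadyAddresses: false   # the default; not-ready endpoints should then stay out of DNS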

What’s the value of “publishNotReadyAddresses” in your cluster and what’s the status of the pod on the dead node?

Thanks for the update!
It seems the default value of publishNotReadyAddresses is false, but the DNS query still returns the IP address of the pod on the dead node. I tried the service.alpha.kubernetes.io/tolerate-unready-endpoints: "false" annotation as well, but no luck.

kubectl describe does show that pod under NotReadyAddresses, but the DNS query resolves the not-ready pod's IP address as well. Please see below.

kubectl describe endpoints opa-headless -n ztna

Name:         opa-headless
Namespace:    ztna
Labels:       app.kubernetes.io/managed-by=Helm
              service=opa
              service.kubernetes.io/headless=
Annotations:  <none>
Subsets:
  Addresses:          10.42.0.10,10.42.2.6
  NotReadyAddresses:  10.42.1.4
  Ports:
    Name  Port  Protocol
    ----  ----  --------
    9191  9191  TCP

Events:  <none>

DNS resolution logs:

[2021-08-18 08:37:09.901][1][debug][upstream] [source/common/upstream/upstream_impl.cc:279] transport socket match, socket default selected for host with address 10.42.2.6:9191
[2021-08-18 08:37:09.901][1][debug][upstream] [source/common/upstream/upstream_impl.cc:279] transport socket match, socket default selected for host with address 10.42.1.4:9191
[2021-08-18 08:37:09.901][1][debug][upstream] [source/common/upstream/upstream_impl.cc:279] transport socket match, socket default selected for host with address 10.42.0.10:9191
[2021-08-18 08:37:09.901][1][debug][upstream] [source/common/upstream/strict_dns_cluster.cc:170] DNS refresh rate reset for opa-headless, refresh rate 5000 ms
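For what it's worth, to see what the cluster DNS itself returns for the headless service (independent of Envoy's own resolution and refresh), a throwaway pod query along these lines should list the A records; this assumes the default cluster.local domain:

kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -n ztna -- nslookup opa-headless.ztna.svc.cluster.local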