CoreDNS not working: "connection timed out" when running nslookup against a Kubernetes service

Kubernetes version: 1.27.2
Cloud being used: bare-metal
Host OS: CentOS Linux release 7.9.2009 (Core)
Installation method: kubeadm
kubeadm init --apiserver-advertise-address="172.16.XX.XX" --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/crio/crio.sock -v=5
CNI and version: flannel (latest at the time; image docker.io/flannel/flannel:v0.22.0)
CRI and version: cri-o://1.27.0
Error description:
Pods cannot connect to services by name. After some investigation we found that the issue is related to CoreDNS. We followed the Kubernetes guide below for debugging DNS resolution, but we could not find the cause.
Debugging DNS Resolution | Kubernetes

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml

[superuser@master1 tst]$ kubectl get pods


NAME       READY   STATUS    RESTARTS   AGE
dnsutils   1/1     Running   1          31m

[superuser@master1 ~]$ kubectl exec -i -t dnsutils -- nslookup kubernetes
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1

Running the same command again a moment later fails:

[superuser@master1 ~]$ kubectl exec -i -t dnsutils -- nslookup kubernetes
;; connection timed out; no servers could be reached

command terminated with exit code 1
[superuser@master1 ~]$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default.svc.cluster.local
;; connection timed out; no servers could be reached

command terminated with exit code 1
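Since the lookup succeeds sometimes and times out other times, a reasonable next step (following the same debugging guide) is to check the pod's resolver config and then query each CoreDNS replica directly. If only the replica running on a different node than dnsutils times out, that points at cross-node pod networking rather than CoreDNS itself. A sketch (the pod IP at the end is a placeholder to fill in from the `-o wide` output):

```shell
# The nameserver inside the pod should be the kube-dns ClusterIP
# (10.96.0.10 in this cluster).
kubectl exec -i -t dnsutils -- cat /etc/resolv.conf

# The kube-dns service should list both CoreDNS pod IPs as endpoints.
kubectl get endpoints kube-dns -n kube-system

# Find each CoreDNS pod's IP and node, then query each replica directly.
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
kubectl exec -i -t dnsutils -- nslookup kubernetes <coredns-pod-ip>
```

If queries to the replica on the remote node consistently time out while the local one answers, the problem is in the overlay network or firewall between nodes.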

More Info :

[superuser@master1 ~]$ kubectl get svc -A

NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  18h
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   18h

kubectl get endpoints

NAME         ENDPOINTS           AGE
kubernetes   172.16.25.31:6443   18h

[superuser@master1 ~]$ kubectl get pods -A

NAMESPACE      NAME                              READY   STATUS    RESTARTS      AGE
default        dnsutils                          1/1     Running   1             32m
kube-flannel   kube-flannel-ds-2mzkx             1/1     Running   3 (16m ago)   18h
kube-flannel   kube-flannel-ds-mgfzs             1/1     Running   2             18h
kube-flannel   kube-flannel-ds-ph9ss             1/1     Running   4 (16m ago)   18h
kube-system    coredns-7bb85699df-5tw22          1/1     Running   1             21m
kube-system    coredns-7bb85699df-w4qc5          1/1     Running   1             21m
kube-system    etcd-master1                      1/1     Running   2             18h
kube-system    kube-apiserver-master1            1/1     Running   2             18h
kube-system    kube-controller-manager-master1   1/1     Running   2             18h
kube-system    kube-proxy-2lhtp                  1/1     Running   2             18h
kube-system    kube-proxy-5kbzb                  1/1     Running   3             18h
kube-system    kube-proxy-g9dj9                  1/1     Running   2             18h
kube-system    kube-scheduler-master1            1/1     Running   2             18h

kubectl get nodes

NAME       STATUS   ROLES           AGE   VERSION
master1    Ready    control-plane   18h   v1.27.2
worker01   Ready    <none>          18h   v1.27.2
worker02   Ready    <none>          18h   v1.27.2

Logs of coredns:
[superuser@master1 ~]$ kubectl logs coredns-7bb85699df-5tw22 -n kube-system
.:53
[INFO] plugin/reload: Running configuration SHA512 = c0af6acba93e75312d34dc3f6c44bf8573acff497d229202a4a49405ad5d8266c556ca6f83ba0c9e74088593095f714ba5b916d197aa693d6120af8451160b80
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
[INFO] 127.0.0.1:48759 - 15482 "HINFO IN 3425859548094321241.2531221399407792863. udp 57 false 512" NXDOMAIN qr,rd,ra 132 4.987843721s
[INFO] 127.0.0.1:42224 - 17502 "HINFO IN 3425859548094321241.2531221399407792863. udp 57 false 512" NXDOMAIN qr,rd,ra 132 1.986966209s
[INFO] 10.244.1.14:53102 - 39388 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.00049116s
[INFO] 10.244.1.14:48134 - 2402 "A IN kubernetes.default.svc.cluster.local.default.svc.cluster.local. udp 80 false 512" NXDOMAIN qr,aa,rd 173 0.000342886s
[INFO] 10.244.1.14:34341 - 46790 "A IN kubernetes.default.default.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.000570933s
[superuser@master1 ~]$ kubectl logs coredns-7bb85699df-w4qc5 -n kube-system
.:53
[INFO] plugin/reload: Running configuration SHA512 = c0af6acba93e75312d34dc3f6c44bf8573acff497d229202a4a49405ad5d8266c556ca6f83ba0c9e74088593095f714ba5b916d197aa693d6120af8451160b80
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
[INFO] 127.0.0.1:48413 - 5861 "HINFO IN 8045793619456579501.3836722543850630215. udp 57 false 512" NXDOMAIN qr,rd,ra 132 4.928481093s
[INFO] 127.0.0.1:55948 - 37614 "HINFO IN 8045793619456579501.3836722543850630215. udp 57 false 512" NXDOMAIN qr,rd,ra 132 1.928028205s

Firewalld is running on the master and on the worker nodes. At the master:
firewall-cmd --list-ports

6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp

At the worker nodes:

10250/tcp 30000-32767/tcp 179/tcp
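One thing worth noting: neither port list includes the flannel VXLAN port. With the default VXLAN backend, flannel encapsulates pod-to-pod traffic in UDP on port 8472, so if firewalld drops it, DNS queries that happen to hit the CoreDNS replica on another node will time out exactly like this, while queries to the local replica succeed. A sketch of the firewalld rules to add on every node (assuming the VXLAN backend and firewalld defaults):

```shell
# Allow flannel VXLAN encapsulated traffic between nodes.
firewall-cmd --permanent --add-port=8472/udp
# Enable masquerading so pod traffic leaving the node is NATed.
firewall-cmd --permanent --add-masquerade
firewall-cmd --reload
```

After reloading, re-run the nslookup test several times to confirm queries to both CoreDNS replicas succeed.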

We will later use this environment in production to run a high-availability cluster with three master nodes, MetalLB, and ingress.


Update: the issue was resolved by downgrading to an older version of flannel.
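In case it helps others, a sketch of how the downgrade can be done by applying the manifest from an older release tag (v0.21.5 here is just an example of a pre-v0.22.0 release; substitute whichever version works in your environment):

```shell
# Remove the current flannel deployment.
kubectl delete -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Apply the manifest published with an older release tag.
kubectl apply -f https://github.com/flannel-io/flannel/releases/download/v0.21.5/kube-flannel.yml

# Watch the DaemonSet pods come back up on all nodes.
kubectl get pods -n kube-flannel -w
```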