Using minikube: DNS pods running but not ready

Cluster information:

Kubernetes version: minikube version: v1.6.2
commit: 54f28ac5d3a815d1196cd5d57d707439ee4bb392
Cloud being used: bare-metal
Installation method: curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
Host OS: Ubuntu 18.04
CNI and version:
CRI and version:

Problem:

I have 3 Docker images. My goal is to prepare a proper helm chart and experiment with networking solutions so that I know how these 3 images should be launched in a customer cloud. For now I am experimenting on my own Ubuntu laptop using minikube.

I start the cluster as below
sudo minikube start --feature-gates=SCTPSupport=true --vm-driver=none --alsologtostderr

Later I use kubectl apply with my .yaml file to deploy the containers/pods. They all start fine, but I am unable to ping between them using their domain names. However, I can ping between them using the respective IPs assigned by the cluster.

I followed Debug DNS Resolution and found out that when I start minikube, my DNS pods are running but never become ready.

$ kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-5kqbc 0/1 Running 0 36s
coredns-6955765f44-zpkkp 0/1 Running 0 36s
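
The READY column above (0/1) is the symptom to watch while trying fixes. A minimal sketch for picking out not-ready pods from that kind of output; the function name `not_ready_pods` is made up for illustration, and it assumes rows in the `NAME READY STATUS ...` layout shown above with the header line removed:

```shell
# Print the names of pods whose READY column shows fewer ready
# containers than total containers (for example "0/1").
# Expects rows like "NAME READY STATUS RESTARTS AGE" on stdin, no header.
not_ready_pods() {
  awk '{
    split($2, r, "/")                 # READY looks like "ready/total"
    if (r[1] + 0 < r[2] + 0) print $1
  }'
}

# Example usage against a live cluster (requires kubectl):
#   kubectl get pods --namespace=kube-system -l k8s-app=kube-dns --no-headers | not_ready_pods
```

An empty result would mean every listed pod reports all of its containers ready.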

Since there are known issues with Ubuntu regarding inherited DNS settings, and the recommendation is to use the --resolv-conf flag for the kubelet, I had to guess how to actually do it, so I started minikube like this:

sudo minikube start --feature-gates=SCTPSupport=true --vm-driver=none --alsologtostderr --extra-config=kubelet.resolv-conf=/run/systemd/resolve/resolv.conf

But no success.
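
For context on why the --resolv-conf recommendation exists: on Ubuntu 18.04 with systemd-resolved, /etc/resolv.conf usually points at the local stub resolver (127.0.0.53), which is not reachable from inside a pod, while /run/systemd/resolve/resolv.conf normally lists the real upstream servers. A small sketch for checking which of the two files contains loopback nameservers; the function name `loopback_nameservers` is hypothetical:

```shell
# List nameserver entries in a resolv.conf-style file that point at a
# loopback address (127.x.x.x). No output means none were found.
loopback_nameservers() {
  awk '$1 == "nameserver" && $2 ~ /^127\./ { print $2 }' "$1"
}

# Example usage on the host:
#   loopback_nameservers /etc/resolv.conf
#   loopback_nameservers /run/systemd/resolve/resolv.conf
```

If the second file also shows a loopback entry, pointing the kubelet at it would not be expected to help.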

10.96.0.1 ?

Also a side question which seems a bit relevant. The cluster IP assigned is 10.96.0.1, which I haven’t specified anywhere. Where does minikube get this IP subnet from, and why does it assign that to my cluster?
It is also visible in output of minikube start

[Install]
config:
{KubernetesVersion:v1.17.0 NodeIP:100.87.6.72 NodePort:8443 NodeName:minikube APIServerName:minikubeCA APIServerNames: APIServerIPs: DNSDomain:cluster.local ContainerRuntime:docker CRISocket: NetworkPlugin: FeatureGates:SCTPSupport=true ServiceCIDR:10.96.0.0/12 ImageRepository: ExtraOptions:[{Component:kubelet Key:resolv-conf Value:/run/systemd/resolve/resolv.conf}] ShouldLoadCachedImages:false EnableDefaultCNI:false}

This is relevant to DNS because I have looked at the /etc/resolv.conf files inside my pods once they are launched, and the nameserver IP is set to an address within this subnet. The pods do not seem to have any route to it, so there is no way they could reach the nameserver.
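
On the side question: the config dump above shows ServiceCIDR:10.96.0.0/12, and 10.96.0.1 is the first address in that range, which Kubernetes conventionally gives to the built-in `kubernetes` Service in front of the API server. Addresses in this range are virtual Service IPs rather than pod or host addresses. A rough sketch of the range arithmetic follows; the helper names `ip_to_int` and `ip_in_cidr` are made up for illustration:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (e.g. 10.96.0.0/12).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")     # part before the slash
  bits=${2#*/}                   # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Example usage:
#   ip_in_cidr 10.96.0.1 10.96.0.0/12 && echo "inside the service range"
```

A /12 prefix starting at 10.96.0.0 covers 10.96.0.0 through 10.111.255.255, so a nameserver address inside a pod's resolv.conf falling in this range would be a Service IP too.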

What does the describe output for your CoreDNS pods say? Also do you have a CNI deployed?

I am still pretty new to minikube and k8s, so I might need a bit of help to answer the questions.

Regarding CNI, I haven’t done anything specific other than start the cluster using minikube start with the options that I listed in the query. Does that mean I am missing a step? Is a CNI normally deployed when deploying a helm chart using kubectl apply?
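
One way to answer the CNI question on the host itself: the kubelet looks for CNI configuration files in /etc/cni/net.d by default, so listing that directory shows whether any plugin has been set up. A minimal sketch, with a hypothetical function name `cni_configs`; it only lists files and does not validate them:

```shell
# List CNI configuration files in a directory (default: /etc/cni/net.d).
# No output means no CNI configuration was found there.
cni_configs() {
  find "${1:-/etc/cni/net.d}" -maxdepth 1 -type f \
    \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) 2>/dev/null | sort
}

# Example usage on the minikube host:
#   cni_configs
```

An empty result would be consistent with the empty NetworkPlugin field and EnableDefaultCNI:false in the minikube start output quoted earlier.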

As for the coredns describe output, I did the following:

kubectl describe pod coredns-6955765f44-qgtbq --namespace=kube-system

This gave me the following. The forum did not allow me to put more than 5 links, so
<k8s_coredns_image> = k8s.gcr.io/coredns:1.6.5

22:15 $ kubectl describe pod coredns-6955765f44-qgtbq --namespace=kube-system
Name: coredns-6955765f44-qgtbq
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: minikube/192.168.10.244
Start Time: Tue, 04 Feb 2020 22:12:56 +0100
Labels: k8s-app=kube-dns
pod-template-hash=6955765f44
Annotations:
Status: Running
IP: 100.109.0.5
IPs:
IP: 100.109.0.5
Controlled By: ReplicaSet/coredns-6955765f44
Containers:
coredns:
Container ID: docker://b1a4564506c801533e4cc75d6ddef1a25fe009e6622b6b80ae7cb181ad0446a1
Image: <k8s_coredns_image>
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:7ec975f167d815311a7136c32e70735f0d00b73781365df1befd46ed35bd4fe7
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Tue, 04 Feb 2020 22:12:58 +0100
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-sjg5r (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-sjg5r:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-sjg5r
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
Warning FailedScheduling 3m9s (x3 over 3m13s) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 3m4s default-scheduler Successfully assigned kube-system/coredns-6955765f44-qgtbq to minikube
Normal Pulled 3m2s kubelet, minikube Container image "<k8s_coredns_image>" already present on machine
Normal Created 3m2s kubelet, minikube Created container coredns
Normal Started 3m2s kubelet, minikube Started container coredns
Warning Unhealthy 2s (x18 over 2m52s) kubelet, minikube Readiness probe failed: HTTP probe failed with statuscode: 503

I also did the following as per the Debug DNS Resolution page. Clearly there is something not right with the 10.96.0.0 that is being used, but I may be wrong.

kubectl logs --namespace=kube-system pod/coredns-6955765f44-qgtbq

E0204 21:12:58.578701 1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
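
The log line above says CoreDNS cannot reach the API server through the Service address 10.96.0.1:443, which would also explain the failing readiness probe. When the log is long, it can help to reduce it to the distinct connection failures. A small sketch, with a hypothetical function name `dial_failures`; it assumes each log entry is on a single line and uses the `dial tcp <addr>:<port>: <reason>` wording seen above:

```shell
# Reduce log text on stdin to the distinct "address reason" pairs taken
# from "dial tcp <addr>:<port>: <reason>" error messages.
dial_failures() {
  sed -nE 's/.*dial tcp ([0-9.]+:[0-9]+): (.*)$/\1 \2/p' | sort -u
}

# Example usage (requires kubectl):
#   kubectl logs --namespace=kube-system pod/coredns-6955765f44-qgtbq | dial_failures
```

If the only failure is "no route to host" for 10.96.0.1:443, one reasonable next step (an assumption, not something confirmed in this thread) is to check that the kube-proxy pod is running and that a host firewall is not rejecting traffic to the Service range.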