We have migrated our Kubernetes VMs from AWS to Azure, and after the migration the Calico and CoreDNS pods are stuck in an Unknown state. Could someone please post your suggestions?


Cluster information:

Kubernetes version: v1.30.8
Cloud being used: Azure (migrated from AWS)
Installation method: migration of existing VMs from AWS to Azure
Host OS: Red Hat Enterprise Linux 8.4
CNI and version: Calico (CNI config version "0.3.1")
CRI and version: containerd 1.7.25 (CRI v1)

[root@master01 net.d]# kubectl get pods -n kube-system -o wide
NAME                                           READY   STATUS    RESTARTS       AGE   IP         NODE                   NOMINATED NODE   READINESS GATES
calico-kube-controllers-564985c589-m6ttv       0/1     Unknown   12             17d   <none>     master01.example.com   <none>           <none>
calico-node-lzcl8                              0/1     Unknown   12             17d   10.5.0.5   master01.example.com   <none>           <none>
calico-node-x5284                              0/1     Unknown   8              17d   10.5.0.4   worker01.example.com   <none>           <none>
coredns-55cb58b774-knfbm                       0/1     Unknown   12             17d   <none>     master01.example.com   <none>           <none>
coredns-55cb58b774-lj5mm                       0/1     Unknown   12             17d   <none>     master01.example.com   <none>           <none>
etcd-master01.example.com                      1/1     Running   2 (63m ago)    23h   10.5.0.5   master01.example.com   <none>           <none>
kube-apiserver-master01.example.com            1/1     Running   2 (63m ago)    23h   10.5.0.5   master01.example.com   <none>           <none>
kube-controller-manager-master01.example.com   1/1     Running   16 (63m ago)   17d   10.5.0.5   master01.example.com   <none>           <none>
kube-proxy-l8shz                               1/1     Running   15 (63m ago)   17d   10.5.0.5   master01.example.com   <none>           <none>
kube-proxy-p28ff                               1/1     Running   11 (64m ago)   17d   10.5.0.4   worker01.example.com   <none>           <none>
kube-scheduler-master01.example.com            1/1     Running   16 (63m ago)   17d   10.5.0.5   master01.example.com   <none>           <none>
[root@master01 net.d]#
[root@master01 net.d]# kubectl get pods -o wide
NAME                          READY   STATUS              RESTARTS   AGE     IP       NODE                   NOMINATED NODE   READINESS GATES
app-server-5f7d8f6d56-6vk68   0/1     Unknown             1          2d4h    <none>   worker01.example.com   <none>           <none>
app-server-5f7d8f6d56-x2zmc   0/1     Unknown             1          2d4h    <none>   worker01.example.com   <none>           <none>
debug                         0/1     ContainerCreating   0          5m54s   <none>   worker01.example.com   <none>           <none>
[root@master01 net.d]# kubectl get nodes -o wide
NAME                   STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                               KERNEL-VERSION                 CONTAINER-RUNTIME
master01.example.com   Ready    control-plane   17d   v1.30.8   10.5.0.5      <none>        Red Hat Enterprise Linux 8.4 (Ootpa)   4.18.0-305.82.1.el8_4.x86_64   containerd://1.7.25
worker01.example.com   Ready    <none>          17d   v1.30.8   10.5.0.4      <none>        Red Hat Enterprise Linux 8.4 (Ootpa)   4.18.0-305.82.1.el8_4.x86_64   containerd://1.7.25
[root@master01 net.d]#

[root@master01 net.d]# uname -a
Linux master01.example.com 4.18.0-305.82.1.el8_4.x86_64 #1 SMP Thu Feb 23 09:21:04 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@master01 net.d]#



The pods went into the Unknown state (as shown above) after we updated the cluster config files with the Azure VM IPs.
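
For reference, these are the places we have been checking for stale AWS addresses (paths and ConfigMap names assume a standard kubeadm layout, and OLD_AWS_IP is just a placeholder for the previous address):

grep -r "OLD_AWS_IP" /etc/kubernetes/ /etc/hosts                 # static pod manifests, kubeconfigs, hosts entries
cat /var/lib/kubelet/kubeadm-flags.env                           # check for a stale --node-ip flag
kubectl -n kube-system get cm kubeadm-config -o yaml             # advertise / controlPlaneEndpoint addresses
kubectl -n kube-system get cm kube-proxy -o yaml | grep server:  # API server address used by kube-proxy
kubectl -n kube-system get ds calico-node -o yaml | grep -A1 IP_AUTODETECTION_METHOD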


Node information shows both nodes as Ready (see the kubectl get nodes output above).
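
Even though both nodes report Ready, we can post the kubelet and container runtime logs from the nodes if that helps (standard systemd unit names assumed):

journalctl -u kubelet --since "1 hour ago" | tail -50       # pod sandbox / CNI errors from the kubelet
journalctl -u containerd --since "1 hour ago" | tail -50    # runtime-side errors for the failing sandboxes
crictl ps -a                                                # container view on the node, independent of the API server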


Describing one of the failing pods points to a network setup issue with the Calico CNI plugin: the error indicates a timeout when trying to reach the ClusterInformation resource through the kubernetes Service ClusterIP.

[root@master01 ~]# kubectl describe pod coredns-798cd4d5bb-btww5 -n kube-system
 desc = failed to setup network for sandbox "69a5c7d1534ef0867d279913c0a99541cda34039530774e81e6d89b9ee0e9063": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: i/o timeout
  Normal   SandboxChanged          7s (x5 over 4m11s)   kubelet  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  7s                   kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "bfc588149b3414c3cbfafd70442b274cdb7498442903392f3231cc3d23d304fd": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: i/o timeout
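
The 10.96.0.1 address in that error is the ClusterIP of the kubernetes Service (listed below). If it helps, we can also post the output of the following checks (labels and port assume kubeadm defaults); they should show whether the Service endpoint points at the current API server address and whether kube-proxy has programmed NAT rules for that ClusterIP:

kubectl get endpoints kubernetes                             # should list the API server at 10.5.0.5:6443 (default port assumed)
iptables-save | grep 10.96.0.1                               # kube-proxy rules for the kubernetes Service ClusterIP
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50  # look for sync/programming errors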

[root@master01 ~]# kubectl get svc
NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
app-service   NodePort    10.96.202.188   <none>        80:32000/TCP   22d
kubernetes    ClusterIP   10.96.0.1       <none>        443/TCP        22d
[root@master01 ~]#

From the master node we are unable to reach the kubernetes Service ClusterIP (10.96.0.1) on port 443.

[root@master01 ~]#  curl -k https://10.96.0.1:443
curl: (7) Failed to connect to 10.96.0.1 port 443: Connection timed out
[root@master01 ~]#
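
Our working assumption is that kube-proxy and Calico are still carrying state from the AWS environment. This is a rough sketch of what we are considering next (daemonset names and labels assume kubeadm defaults); please correct us if this is the wrong direction:

sysctl net.ipv4.ip_forward                              # must be 1 on every node
kubectl -n kube-system rollout restart ds/kube-proxy    # rebuild Service NAT rules with the new node IPs
kubectl -n kube-system rollout restart ds/calico-node   # let Calico re-detect the node addresses
kubectl -n kube-system delete pod -l k8s-app=kube-dns   # recreate the stuck CoreDNS pods afterwards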

Dear Experts,

Could someone please suggest what configuration changes are needed for the Calico and CoreDNS pods to start working again?

Thanks in advance.