K8s on-premises - CoreDNS and ClusterIssuer problems

Hi all,
I followed the instructions at https://computingforgeeks.com/deploy-kubernetes-cluster-on-ubuntu-with-kubeadm/ to install a k8s cluster on-premises on Ubuntu 20.04 with kubeadm.

Cluster information:

kubectl version --short
Client Version: v1.26.2
Kustomize Version: v4.5.7
Server Version: v1.26.2

kubectl cluster-info
Kubernetes control plane is running at https://k8s.mydomain.com:6443
CoreDNS is running at https://k8s.mydomain.com:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

Cloud being used: bare-metal
Installation method:

kubeadm init --v=5 --pod-network-cidr=10.244.0.0/16 --upload-certs --control-plane-endpoint=k8s.mydomain.com

Host OS: Ubuntu 22.04
CNI and version: Flannel latest
CRI and version: CRI-O latest
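
For reference, I deployed Flannel roughly like this (the exact manifest URL may differ from what I actually used):

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Flannel's default pod network is 10.244.0.0/16, which is why I passed --pod-network-cidr=10.244.0.0/16 to kubeadm init.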

Problem 1: CoreDNS pods cannot start

kubectl get pod -n kube-system
NAME                                       READY   STATUS                 RESTARTS   AGE
coredns-787d4945fb-g8hhj                   0/1     CreateContainerError   0          7m1s
coredns-787d4945fb-nkssp                   0/1     CreateContainerError   0          6m29s
etcd-k8s-52ts-master1                      1/1     Running                1          12h
kube-apiserver-k8s-52ts-master1            1/1     Running                1          12h
kube-controller-manager-k8s-52ts-master1   1/1     Running                1          12h
kube-proxy-m4mhc                           1/1     Running                1          12h
kube-proxy-rrdpb                           1/1     Running                0          12h
kube-scheduler-k8s-52ts-master1            1/1     Running                1          12h

kubectl describe pod coredns-787d4945fb-g8hhj -n kube-system
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  65s   default-scheduler  Successfully assigned kube-system/coredns-787d4945fb-g8hhj to k8s-52ts-worker1
  Warning  Failed     64s   kubelet            Error: container create failed: time="2023-03-17T08:26:01+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-105d630363803602c6bfe9c1516b128192b95c519ff321ed1177fe0cacdc9b42.scope/memory.events: no such file or directory"
time="2023-03-17T08:26:01+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
  Warning  Failed  63s  kubelet  Error: container create failed: time="2023-03-17T08:26:02+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-2bbebc0f6e74a3c5a8f5e460e758e322a94ffe97964f34e357d831ce3fced345.scope/memory.events: no such file or directory"
time="2023-03-17T08:26:02+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
  Warning  Failed  47s               kubelet  Error: container create failed: time="2023-03-17T08:26:18+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
  Warning  Failed  34s               kubelet  Error: container create failed: time="2023-03-17T08:26:31+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
  Warning  Failed  19s               kubelet  Error: container create failed: time="2023-03-17T08:26:46+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
  Normal   Pulled  9s (x6 over 64s)  kubelet  Container image "registry.k8s.io/coredns/coredns:v1.9.3" already present on machine
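
The error exec: \"/coredns\": stat /coredns: no such file or directory makes me suspect the coredns image on the worker node is corrupt or was unpacked incorrectly by CRI-O. If it helps, these are the diagnostics I would run on k8s-52ts-worker1 (commands I plan to run, not output I already have):

sudo crictl images | grep coredns
sudo crictl inspecti registry.k8s.io/coredns/coredns:v1.9.3
# if the image looks suspect, remove it so CRI-O pulls a fresh copy on the next container start
sudo crictl rmi registry.k8s.io/coredns/coredns:v1.9.3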

I installed MetalLB successfully

kubectl get all -n metallb-system
NAME                              READY   STATUS    RESTARTS   AGE
pod/controller-68bf958bf9-crsws   1/1     Running   0          105s
pod/speaker-n5fl7                 1/1     Running   0          105s
pod/speaker-qj552                 1/1     Running   0          105s

NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/webhook-service   ClusterIP   10.100.0.100   <none>        443/TCP   106s

NAME                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/speaker   2         2         2       2            2           kubernetes.io/os=linux   105s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/controller   1/1     1            1           106s

kubectl get ipaddresspools.metallb.io -n metallb-system
NAME       AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
k8s   true          false             ["192.168.7.190-192.168.7.195"]

kubectl get l2advertisements.metallb.io -n metallb-system
NAME        IPADDRESSPOOLS   IPADDRESSPOOL SELECTORS   INTERFACES
l2-advert

kubectl describe ipaddresspools.metallb.io k8s -n metallb-system
Name:         k8s
Namespace:    metallb-system
Labels:       <none>
Annotations:  <none>
API Version:  metallb.io/v1beta1
Kind:         IPAddressPool
Metadata:
  Creation Timestamp:  2023-03-16T13:27:25Z
  Generation:          1
  Managed Fields:
    API Version:  metallb.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:addresses:
        f:autoAssign:
        f:avoidBuggyIPs:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2023-03-16T13:27:25Z
  Resource Version:  3744
  UID:               3e7ebe10-1073-4faf-9da0-3cdc98e659ad
Spec:
  Addresses:
    192.168.7.190-192.168.7.195
  Auto Assign:       true
  Avoid Buggy I Ps:  false
Events:              <none>
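
For completeness, the IPAddressPool and L2Advertisement I applied look roughly like this (reconstructed from the output above, so field order may differ slightly from my original files):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: k8s
  namespace: metallb-system
spec:
  addresses:
  - 192.168.7.190-192.168.7.195
  autoAssign: true
  avoidBuggyIPs: false
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2-advert
  namespace: metallb-system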

I installed ingress-nginx successfully

kubectl get all -n ingress-nginx
NAME                                           READY   STATUS    RESTARTS   AGE
pod/ingress-nginx-controller-c69664497-qpdct   1/1     Running   0          2m16s

NAME                                         TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
service/ingress-nginx-controller             LoadBalancer   10.103.198.89   192.168.7.190   80:32500/TCP,443:32175/TCP   2m16s
service/ingress-nginx-controller-admission   ClusterIP      10.102.57.244   <none>          443/TCP                      2m16s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           2m16s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-c69664497   1         1         1       2m16s

I installed a demo web app and created an Ingress rule to expose it; I can access the web app from outside the k8s cluster through "192.168.7.190" successfully.
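
In case it matters, the Ingress rule for the demo app looks roughly like this (the app name, host, and port below are placeholders, not my real values):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-web
spec:
  ingressClassName: nginx
  rules:
  - host: demo.mydomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-web
            port:
              number: 80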

I installed cert-manager successfully

kubectl get all -n cert-manager
NAME                                          READY   STATUS    RESTARTS   AGE
pod/cert-manager-6ffb79dfdb-9sq7g             1/1     Running   0          100s
pod/cert-manager-cainjector-5fcd49c96-kcl7b   1/1     Running   0          101s
pod/cert-manager-webhook-796ff7697b-4cxkm     1/1     Running   0          100s

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.105.156.184   <none>        9402/TCP   101s
service/cert-manager-webhook   ClusterIP   10.106.211.161   <none>        443/TCP    101s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager              1/1     1            1           101s
deployment.apps/cert-manager-cainjector   1/1     1            1           101s
deployment.apps/cert-manager-webhook      1/1     1            1           100s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-6ffb79dfdb             1         1         1       101s
replicaset.apps/cert-manager-cainjector-5fcd49c96   1         1         1       101s
replicaset.apps/cert-manager-webhook-796ff7697b     1         1         1       100s
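
cert-manager was installed with the standard static manifest, roughly like this (the exact version may differ from what I used):

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.yaml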

Problem 2: ClusterIssuer is not ready

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: myemail@mydomain.com
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
    - http01:
        ingress:
          class: nginx

kubectl apply -f issuer-letsencrypt-production.yaml
clusterissuer.cert-manager.io/letsencrypt-production created

But it is not READY

kubectl get clusterissuer -n cert-manager
NAME                     READY   AGE
letsencrypt-production   False   33s

kubectl describe clusterissuer/letsencrypt-production -n cert-manager
Name:         letsencrypt-production
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  cert-manager.io/v1
Kind:         ClusterIssuer
Metadata:
  Creation Timestamp:  2023-03-16T13:50:31Z
  Generation:          1
  Managed Fields:
    API Version:  cert-manager.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:acme:
          .:
          f:email:
          f:privateKeySecretRef:
            .:
            f:name:
          f:server:
          f:solvers:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2023-03-16T13:50:31Z
    API Version:  cert-manager.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:acme:
        f:conditions:
          .:
          k:{"type":"Ready"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:observedGeneration:
            f:reason:
            f:status:
            f:type:
    Manager:         cert-manager-clusterissuers
    Operation:       Update
    Subresource:     status
    Time:            2023-03-16T13:55:48Z
  Resource Version:  6957
  UID:               a39ecd1d-2a74-453c-89bb-548db12ed901
Spec:
  Acme:
    Email:            myemail@mydomain.com
    Preferred Chain:
    Private Key Secret Ref:
      Name:  letsencrypt-production
    Server:  https://acme-v02.api.letsencrypt.org/directory
    Solvers:
      http01:
        Ingress:
          Class:  nginx
Status:
  Acme:
  Conditions:
    Last Transition Time:  2023-03-16T13:50:41Z
    Message:               Failed to register ACME account: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:50165->10.96.0.10:53: read: connection refused
    Observed Generation:   1
    Reason:                ErrRegisterACMEAccount
    Status:                False
    Type:                  Ready
Events:
  Type     Reason         Age                  From                         Message
  ----     ------         ----                 ----                         -------
  Warning  ErrInitIssuer  5m12s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:37136->10.96.0.10:53: i/o timeout
  Warning  ErrInitIssuer  5m2s                 cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:52820->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  4m52s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:58875->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  4m37s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:46895->10.96.0.10:53: i/o timeout
  Warning  ErrInitIssuer  4m27s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:43995->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  4m17s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:44989->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  4m1s                 cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:45239->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  3m51s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:43369->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  3m41s                cert-manager-clusterissuers  Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:46797->10.96.0.10:53: read: connection refused
  Warning  ErrInitIssuer  5s (x18 over 3m26s)  cert-manager-clusterissuers  (combined from similar events): Error initializing issuer: Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org on 10.96.0.10:53: read udp 10.244.1.8:50165->10.96.0.10:53: read: connection refused
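
Looking at these errors, I suspect Problem 2 is just a consequence of Problem 1: 10.96.0.10 is the kube-dns ClusterIP, and with the CoreDNS pods stuck in CreateContainerError the cert-manager pod (10.244.1.8) cannot resolve acme-v02.api.letsencrypt.org, so ACME account registration fails. Once CoreDNS is running I would re-check in-cluster DNS with something like this (a generic test, not output I already have):

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup acme-v02.api.letsencrypt.org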

How can I troubleshoot and fix these two problems? Please give me some advice, thank you very much.