I have “inherited” a bare-metal Kubernetes cluster, installed with kubeadm, consisting of a control plane and several workers (Ubuntu 20.04 LTS). I’m currently working on a VM template (Ubuntu 22.04) to serve as the base for a batch of new workers. I’ve joined it to the cluster, tagged it, and run some pods on it to test that everything works before duplicating it; however, I have run into a very peculiar issue:
The template VM itself runs and connects to the cluster perfectly fine. The control plane receives readiness and status updates without issues.
The test pods I’ve used, however, cannot establish a connection to any target outside the cluster. This shows up mainly as DNS resolution failures, but connecting to IPs directly also results in complete packet loss.
Trying to ping/curl from the VM itself works without issues.
Trying to ping/curl from the same pod scheduled on one of the old workers works without issues.
I’ve also tried starting the same container directly on the VM through Docker, and it works without issues.
It seems that things only fail when the container is run through Kubernetes and scheduled on this specific node, and I cannot figure out why.
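The comparison above (DNS lookup versus direct IP, from the host, from a Docker container, and from a pod on each node) can be repeated consistently with a small probe. The following is only a minimal sketch: the function names are made up for illustration, `example.com` and `1.1.1.1:443` are placeholder targets rather than the ones actually tested, and the hints it prints are rough pointers, not a diagnosis.

```python
import socket


def resolve(hostname):
    """Return the IPv4 addresses for hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_reachable(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable networks
        return False


def summarize(dns_ok, ip_ok):
    """Turn the two probe results into a rough, non-authoritative hint."""
    if dns_ok and ip_ok:
        return "DNS and direct IP both work"
    if ip_ok:
        return "direct IP works, DNS fails: look at the resolver/CoreDNS path"
    if dns_ok:
        return "DNS works, direct IP fails: look at egress for that target"
    return "both fail: look at pod egress on this node, not only DNS"


if __name__ == "__main__":
    HOSTNAME = "example.com"   # placeholder target (assumption)
    IP, PORT = "1.1.1.1", 443  # placeholder target (assumption)
    addresses = resolve(HOSTNAME)
    ip_ok = tcp_reachable(IP, PORT)
    print("resolved:", addresses)
    print("tcp to %s:%d ok: %s" % (IP, PORT, ip_ok))
    print(summarize(bool(addresses), ip_ok))
```

Running the same script on the VM, in a Docker container on the VM, and in a pod on both the new and an old worker gives directly comparable output. Since direct IP access also fails from the affected pods, the "both fail" case is the one that matches the symptoms described here.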
The cluster runs CoreDNS, and I’ve already tried adding a forward to a “proper” DNS IP, and even adding the initially problematic URL directly to the hosts section. I’ve also tried deploying the pod with both the ClusterFirst and Default dnsPolicy, without luck.
Does anyone else have an idea what could be the issue behind this? At this point I’m rather stumped.