The Architectural Conflict Between cluster.local and Linux NSSwitch/mDNS Isolation Policies

Hi community,

I want to raise awareness about a proven architectural conflict between Kubernetes’ default service domain (cluster.local) and the standard Linux Name Service Switch (nsswitch.conf) when implementing strict mDNS security or network isolation policies.

:hammer_and_wrench: The Proof of Conflict

In environments requiring strict local network privacy (such as edge computing, IoT, or hybrid bare-metal clusters), it is standard practice to resolve local names over mDNS only and to prevent those queries from leaking to public or upstream DNS servers.

To achieve this, we configured a strict DNS isolation policy inside a standard Debian-slim container using /etc/nsswitch.conf:

hosts:          files mdns4_minimal [NOTFOUND=return]

The expected Linux behavior is definitive:

  1. Check /etc/hosts (files).

  2. Query via multicast DNS (mdns4_minimal) for link-local (.local) names.

  3. If not found, immediately abort and return an error ([NOTFOUND=return]) to block regular DNS fallback. This is crucial for privacy and preventing traffic leakage.
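
The three steps above can be sketched as a toy model of how an NSS source chain is walked. This is a simplified illustration, not glibc's implementation: the lookup tables, the IP addresses, the printer.local entry, and the status codes returned by the stub mdns4_minimal function are all assumptions made for the example.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    NOTFOUND = "notfound"
    UNAVAIL = "unavail"
    TRYAGAIN = "tryagain"


# Default NSS behaviour: stop on SUCCESS, move to the next source otherwise.
DEFAULT_ACTIONS = {
    Status.SUCCESS: "return",
    Status.NOTFOUND: "continue",
    Status.UNAVAIL: "continue",
    Status.TRYAGAIN: "continue",
}

# Hypothetical data standing in for /etc/hosts, the local link, and CoreDNS.
HOSTS_FILE = {"localhost": "127.0.0.1"}
LINK_LOCAL = {"printer.local": "192.168.1.50"}
CLUSTER_DNS = {"light-http-service.default.svc.cluster.local": "10.96.0.42"}


def files(name):
    ip = HOSTS_FILE.get(name)
    return (Status.SUCCESS, ip) if ip else (Status.NOTFOUND, None)


def mdns4_minimal(name):
    # Assumption: names outside .local are declined (UNAVAIL), while
    # .local names with no answer on the link come back as NOTFOUND.
    if not name.rstrip(".").endswith(".local"):
        return (Status.UNAVAIL, None)
    ip = LINK_LOCAL.get(name)
    return (Status.SUCCESS, ip) if ip else (Status.NOTFOUND, None)


def dns(name):
    ip = CLUSTER_DNS.get(name)
    return (Status.SUCCESS, ip) if ip else (Status.NOTFOUND, None)


def resolve(name, chain):
    """Walk the chain; each entry is (source_fn, per-status action overrides)."""
    status = Status.UNAVAIL
    for source, overrides in chain:
        status, ip = source(name)
        action = {**DEFAULT_ACTIONS, **overrides}[status]
        if action == "return":
            return status, ip
    return status, None


STOP_ON_NOTFOUND = {Status.NOTFOUND: "return"}

# hosts: files mdns4_minimal [NOTFOUND=return]
strict = [(files, {}), (mdns4_minimal, STOP_ON_NOTFOUND)]
# Same policy with a dns source appended after the action item.
strict_with_dns = strict + [(dns, {})]
# hosts: files dns
plain = [(files, {}), (dns, {})]

svc = "light-http-service.default.svc.cluster.local"
print(resolve(svc, strict))           # stops at mdns4_minimal
print(resolve(svc, strict_with_dns))  # still stops before dns
print(resolve(svc, plain))            # reaches dns
```

In this model, appending a dns source after the action item makes no difference for names under .local: the chain stops at the mDNS step before dns is ever consulted.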

The Failure: Once this standard Linux policy is applied, all Kubernetes cluster service discovery breaks instantly. Running curl light-http-service.default.svc.cluster.local fails immediately with Could not resolve host.

Why it happens: Because the default Kubernetes cluster domain, cluster.local, ends in .local, the mdns4_minimal NSS plugin treats every service name as an mDNS name under RFC 6762. (It is the nss-mdns plugin that makes this decision, not the glibc resolver itself.) The plugin gets no mDNS answer for the cluster service on the local link and returns NOTFOUND, at which point the [NOTFOUND=return] rule triggers a hard stop. The lookup never reaches unicast DNS, so the CoreDNS address listed in /etc/resolv.conf is never queried, effectively blinding the container to cluster service discovery.
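
To check a container image for this condition, a small lint over the hosts: line of /etc/nsswitch.conf may help. The sketch below is a hypothetical helper, not part of any existing tool. It assumes that any source whose name starts with mdns claims .local names, and it ignores negated action items such as [!UNAVAIL=return].

```python
import re


def parse_hosts_line(line):
    """Split an nsswitch.conf 'hosts:' line into [(source, {status: action})]."""
    body = line.split("#", 1)[0]  # drop trailing comments
    key, _, rest = body.partition(":")
    if key.strip() != "hosts":
        raise ValueError("not a hosts: line")
    entries = []
    # Tokens are either a bracketed action list or a bare source name.
    for token in re.findall(r"\[[^\]]*\]|\S+", rest):
        if token.startswith("["):
            if not entries:
                raise ValueError("action item before any source")
            for item in token[1:-1].split():
                status, _, action = item.partition("=")
                entries[-1][1][status.lower()] = action.lower()
        else:
            entries.append((token, {}))
    return entries


def dns_reachable_for_local(line):
    """True if a .local name that mDNS cannot find would still reach 'dns'."""
    for source, actions in parse_hosts_line(line):
        if source == "dns":
            return True
        if source.startswith("mdns") and actions.get("notfound") == "return":
            return False
    return False  # no dns source configured at all


print(dns_reachable_for_local("hosts:          files mdns4_minimal [NOTFOUND=return]"))
print(dns_reachable_for_local("hosts: files mdns4_minimal [NOTFOUND=return] dns"))
print(dns_reachable_for_local("hosts: files dns"))
```

A False result means names under cluster.local cannot reach CoreDNS from that container, whatever /etc/resolv.conf says.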

:face_with_monocle: The Core Issue

This is not a bug in Linux, nor is it a bug in CoreDNS; it is a naming collision inherent to Kubernetes’ default design. By adopting a .local suffix for a centralized, cluster-wide DNS architecture, Kubernetes directly collides with the IETF standard (RFC 6762), which reserves .local for link-local multicast DNS.

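While advanced users can override the cluster domain to .cluster.internal via kubeadm at bootstrap (the networking.dnsDomain field of its ClusterConfiguration, which corresponds to the kubelet's clusterDomain setting), .cluster.local remains the out-of-the-box default for the vast majority of clusters.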
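
To make the effect of the cluster domain concrete, here is a minimal sketch. It assumes the usual <service>.<namespace>.svc.<cluster-domain> layout for Service names and reads RFC 6762 as claiming any name whose last label is local. Both function names are hypothetical.

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the DNS name of a Service under the given cluster domain."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def claimed_by_mdns(name):
    """True if the name's last label is 'local', i.e. it falls under .local."""
    labels = name.rstrip(".").lower().split(".")
    return labels[-1] == "local"


default_name = service_fqdn("light-http-service", "default")
renamed = service_fqdn("light-http-service", "default", "cluster.internal")

print(default_name, claimed_by_mdns(default_name))
print(renamed, claimed_by_mdns(renamed))
```

Only the last label matters here, so every Service under the default domain lands in the mDNS namespace, while the same Service under cluster.internal does not.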

:speech_balloon: Discussion Points

  1. As Kubernetes continues to expand into IoT and edge environments where mDNS/Avahi coexistence is mandatory, should the default cluster domain be changed to a non-colliding suffix (like .cluster.internal or .k8s) in future major releases?

  2. How are edge/hybrid cluster operators managing this collision today without forcing cluster-wide domain rewrites?

I look forward to hearing your insights on this fundamental naming conflict!

Hi,

Good idea, and I kind of get the thought behind it.

But AFAIK, we can always work around this by updating the Corefile in the CoreDNS ConfigMap.
It’s a simple configuration change that we need to make to meet the use case.

So I am not sure that Kubernetes or CoreDNS needs to change from the default cluster.local domain.

Thanks,
Manan

Resolved (2024.07.29.06), the Board reserves .INTERNAL from delegation in the DNS root zone permanently to provide for its use in private-use applications. The Board recommends that efforts be undertaken to raise awareness of its reservation for this purpose through the organization’s technical outreach.