Is it normal to be unable to curl a Kubernetes service's ClusterIP from inside a pod?

Hi everyone! :wave:

Cluster information:

Kubernetes version: v1.31.0
Cloud being used: vps hetzner
Installation method: kubeadm
Host OS: ubuntu 24.04
CNI and version: flannel - v1.5.1-flannel2
CRI and version: containerd - crictl version v1.31.1


I’m running into an issue in my Kubernetes cluster where I can’t curl a service’s ClusterIP from inside a pod. The service itself is up and running — I can see it when I run kubectl get svc — but whenever I try to access it using curl, the request just times out.

I’m not sure if this is expected behavior or if there might be a misconfiguration somewhere in my cluster setup. :thinking:

Here’s some context:
The service is of type ClusterIP.

I can reach the backing pod by curling its IP directly, but not through the service's ClusterIP. :x:

DNS resolution works fine — nslookup returns the correct IP for the service. :white_check_mark:

There are no NetworkPolicies in place that would restrict traffic in the namespace. :no_entry_sign:
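
For completeness, these are the checks I would run to confirm the Service actually selects the pod and that kube-proxy is healthy (a sketch using the names from the manifests further down; adjust to your setup):

kubectl -n apps get endpoints hello-web            # should list the pod IP on port 80
kubectl -n apps get pods -l app=hello-web -o wide  # pod must be Running; note its IP and node
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide  # one healthy kube-proxy per node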

Has anyone encountered something like this before? Any insights or advice would be greatly appreciated! :pray:


I ran the following simple test to try and understand the issue:

kubectl run curl-test --rm -i --tty --image=curlimages/curl -- /bin/sh

If you don’t see a command prompt, try pressing enter.
~ $ nslookup 10.103.60.54
Server: 10.96.0.10
Address: 10.96.0.10:53

54.60.103.10.in-addr.arpa name = hello-web.apps.svc.cluster.local

~ $ curl http://hello-web.apps.svc.cluster.local:80
curl: (28) Failed to connect to hello-web.apps.svc.cluster.local port 80 after 135674 ms: Could not connect to server
~ $ curl http://10.103.60.54:80
curl: (28) Failed to connect to 10.103.60.54 port 80 after 132964 ms: Could not connect to server


According to the official Kubernetes documentation:

Services
A/AAAA records
“Normal” (not headless) Services are assigned DNS A and/or AAAA records, depending on the IP family or families of the Service, with a name of the form my-svc.my-namespace.svc.cluster-domain.example. This resolves to the cluster IP of the Service.

Based on this, I understand that I should be able to use curl http://hello-web.apps.svc.cluster.local:80 to reach the service, as it resolves correctly in the DNS lookup.

However, both the DNS name and the ClusterIP return the same error when attempting to curl the service.
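
Put differently, a forward lookup of the Service name should hand back the very ClusterIP I'm failing to reach (a sketch; 10.103.60.54 is taken from the reverse lookup above):

nslookup hello-web.apps.svc.cluster.local    # should return 10.103.60.54, the Service's ClusterIP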


Here is the service and deployment manifest I’m using:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
  namespace: apps
  labels:
    app: hello-web
spec:
  selector:
    matchLabels:
      app: hello-web
  replicas: 1
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-web
  labels:
    run: hello-web
  namespace: apps
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: hello-web
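
One note on the Service spec above: targetPort is omitted, and Kubernetes then defaults it to the same value as port, so traffic to the Service should be forwarded to port 80 in the pod, matching the containerPort in the Deployment. Spelled out explicitly, the ports section would be equivalent to this (sketch):

  ports:
  - port: 80
    targetPort: 80
    protocol: TCP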


What I expected:

I expected that the curl requests to either the service name (hello-web.apps.svc.cluster.local) or the ClusterIP (10.103.60.54) would successfully reach the Nginx container running in the pod. :hammer_and_wrench: Since DNS resolves correctly and the service looks properly configured, I thought this would work smoothly.

However, the requests are timing out. :confused:
I’m not sure if it’s a configuration issue on my side or if there’s something missing in the Kubernetes network setup.
Or… could this even be the default behavior and maybe it wasn’t supposed to work this way in the first place? :man_shrugging:

Any help or insights would be awesome! :pray:


Solution Recap:

:dart: After some guidance, I was able to resolve the issue! The problem stemmed from a mismatch in the configuration between Flannel (CNI) and Kube-Proxy.

Here’s what I learned:

The Issue:

  • Flannel requires masquerading to be enabled (masquerade: true).
  • By default, Kube-Proxy has masqueradeAll: false, which caused the conflict.
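
To see how both sides are actually configured, checks along these lines can be used (a sketch; kube-flannel / kube-flannel-ds are the defaults from the upstream Flannel manifest and may differ in your install):

kubectl -n kube-flannel get ds kube-flannel-ds -o yaml | grep -- '--ip-masq'   # flannel masquerading flag
kubectl -n kube-system get cm kube-proxy -o yaml | grep masqueradeAll          # current kube-proxy value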

The Fix:

  1. :hammer_and_wrench: Edit the Kube-Proxy ConfigMap:
  • Run: kubectl -n kube-system edit cm kube-proxy
  • Set masqueradeAll: true in the configuration (see the snippet below).
  2. :arrows_counterclockwise: Restart the Kube-Proxy Pods:
  • Run: kubectl -n kube-system delete pod -l k8s-app=kube-proxy
  • This will restart all the Kube-Proxy pods with the updated configuration.
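
For reference, after the edit the relevant part of the kube-proxy ConfigMap (the config.conf key) looks roughly like this (a sketch; other fields stay at their defaults):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
iptables:
  masqueradeAll: true    # default is false
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s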

Key Takeaway:

:warning: This is a subtle and niche default configuration mismatch between Flannel and Kube-Proxy that can cause a lot of confusion. If you’re facing similar issues, this might be the fix you’re looking for!


I still don't understand why it doesn't work. If anything below is wrong, please correct me, thanks.
Flannel's masquerade: when Flannel's masquerade is set to true, it masquerades the source IP address of all packets leaving the pod network, replacing it with the node's IP address.
Kube-Proxy's masquerade-all: Kube-Proxy masquerades all outbound traffic from the node, including service traffic, so the source IP address of those packets is also replaced with the node's IP address.

In my understanding, Flannel's masquerade and Kube-Proxy's masquerade-all can work separately.

So in your scenario, Flannel's masquerade is true and Kube-Proxy's masquerade-all is false, and curling a service's ClusterIP fails.

Is the failure because the target pod never receives the request, or because the calling pod never receives the response from the target?
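
One way to answer that empirically might be to watch the traffic on the node that hosts the target pod while curling the ClusterIP from another pod (a sketch; cni0 is flannel's default bridge name, and the IP is the ClusterIP from this thread):

tcpdump -ni cni0 tcp port 80      # does the request reach the pod bridge, and does a reply go back?
conntrack -L -d 10.103.60.54      # is there a conntrack entry for the ClusterIP, and in what state?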
