How do we write an ingress NetworkPolicy object in kubernetes, so that calls via api server proxy can come through?

Cluster information:

Kubernetes version: v1.16.4
Cloud being used: vSphere, on prem
Installation method: kubeadm
Host OS: Ubuntu 18.04
CNI and version: Canal 3.8
CRI and version: Docker 18.09

I encountered this problem when trying to secure the Kubernetes dashboard with Network Policies. At the time of writing, the dashboard consists of two pods: the dashboard itself and the metrics scraper. The dashboard periodically calls the metrics scraper and therefore needs access to it.

It appears that the dashboard accesses the metrics scraper via the API server proxy. It also appears that the pod and namespace selectors in a NetworkPolicy object do nothing to allow that type of access. I was able to use an ipBlock of 0.0.0.0/0 to grant access, but this defeats the purpose of having a NetworkPolicy; I would like to be much more specific about where I'm allowing access from.
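For reference, the overly broad rule I mention looks roughly like the sketch below. The pod label and port are placeholders (in my case it was the metrics scraper pod), and 0.0.0.0/0 is exactly the catch-all I want to get rid of:

# Sketch of the catch-all ingress rule mentioned above.
# The podSelector label and the port are placeholders; adjust to your deployment.
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: metrics-scraper
spec:
  podSelector:
    matchLabels:
      app: metrics-scraper    # placeholder label
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: 0.0.0.0/0       # allows traffic from anywhere, defeating the purpose
    ports:
    - protocol: TCP
      port: 8000              # placeholder port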

One peculiar thing is that when access is working, the incoming IP address in the metrics scraper logs is shown as 10.244.0.0, which is the start of the entire Canal subnet I'm using for Kubernetes networking. An ipBlock of 10.244.0.0/16 in the NetworkPolicy object also works and allows access, but that is not much better. Specifying 10.244.0.0/32 (the exact IP address from the logs) does not allow access.
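To double-check which pod CIDR the cluster is actually using, the per-node allocations can be read from the node objects. This is just a sanity check and assumes a kubeadm-style cluster where the controller manager assigns a podCIDR to each node:

# Show each node's allocated pod CIDR (assumes podCIDR is set on the node objects)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'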

Note: I was just looking for a link to the Canal page on the Calico website, but it has disappeared. Similarly, any mention of Canal has also disappeared from the Kubernetes website, so I'm guessing it's deprecated. At this stage I do not know whether this issue is related to the networking provider or not; my current guess is that it's not.

Below is how to reproduce this issue from scratch, that is, without relying on the dashboard. Any ideas on how to resolve it are greatly appreciated.

# Create a namespace where all our test objects will reside. At the end we delete the whole namespace with everything in it
kubectl create namespace testing

# This is the service account that we will grant access to the api server proxy
kubectl create serviceaccount -n testing curl

# This role describes access to api server proxy for our nginx service, port 80
kubectl create role -n testing curl --verb=get --resource=services/proxy --resource-name=nginx:80

# Bind the account and the role above together
kubectl create rolebinding -n testing curl --role=curl --serviceaccount=testing:curl
 
# Create an nginx server
kubectl run nginx -n testing --image=nginx --labels="app=nginx" --generator=run-pod/v1

# Create a curl container to test connectivity from
kubectl run curl -n testing --serviceaccount=curl --image=curlimages/curl --labels="app=curl" --generator=run-pod/v1 sleep 999999

# Create a service that exposes nginx (mainly for discovery purposes)
kubectl create service -n testing clusterip nginx --tcp=80:80

# Try accessing nginx from the curl container - it works
kubectl exec -n testing -it curl -- curl http://nginx

# And also via api server proxy - it works
kubectl exec -n testing -it curl -- sh -c 'curl -k "https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT_HTTPS/api/v1/namespaces/testing/services/nginx:80/proxy/" --header "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"'

# Look at the nginx logs. Note the source IP of the api server proxy call. In my case it's x.y.z.0
kubectl logs -n testing nginx

# Let's create and apply NetworkPolicy objects for the pods we just created
cat <<EOF > network-policy.yaml
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx
  namespace: testing
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: curl
    ports:
    - protocol: TCP
      port: 80
EOF

kubectl apply -f network-policy.yaml

# At this stage straight nginx call is still working:
kubectl exec -n testing -it curl -- curl http://nginx

# But the api server proxy call no longer works
kubectl exec -n testing -it curl -- sh -c 'curl -k "https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT_HTTPS/api/v1/namespaces/testing/services/nginx:80/proxy/" --header "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"'

# clean up
kubectl delete namespace testing

I no longer seem to be able to edit the post above, so I'm adding this in a new one:

I removed Canal and installed Calico instead, with the same CIDR of 10.244.0.0/16. I'm still having the same issue.

Is there anything missing from this question, or is anything unclear? Is there any other way to get community help? I also tried Slack (same result), and as I understand it GitHub is for issues only, not for support. Thank you in advance!

I managed to solve this a while ago after more research and help on the Calico forums. I did not post it here earlier since the deafening silence was really demotivating, but I hope it will help the next person who stumbles upon this problem.

The solution was to whitelist the Calico IP address of each node (since the request can come from any of them):

# Create a NetworkPolicy template; the node addresses are filled in and the policy applied below
cat <<EOF > network-policy-template.yaml
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx
  namespace: testing
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: curl
    ports:
    - protocol: TCP
      port: 80
  # Access via kubernetes proxy
  - from:
    - ipBlock:
        cidr: $MASTER1/32
    - ipBlock:
        cidr: $MASTER2/32
    - ipBlock:
        cidr: $MASTER3/32
    ports:
    - protocol: TCP
      port: 80
EOF

# Get the Calico IPIP tunnel address for each master node
export MASTER1=$(kubectl get node master-01 -ojson | jq -r '.metadata.annotations."projectcalico.org/IPv4IPIPTunnelAddr"')
export MASTER2=$(kubectl get node master-02 -ojson | jq -r '.metadata.annotations."projectcalico.org/IPv4IPIPTunnelAddr"')
export MASTER3=$(kubectl get node master-03 -ojson | jq -r '.metadata.annotations."projectcalico.org/IPv4IPIPTunnelAddr"')

# Substitute the addresses into the policy file and apply the policy
envsubst < network-policy-template.yaml > network-policy.yaml
kubectl apply -f network-policy.yaml

# Now the api server proxy call works
kubectl exec -n testing -it curl -- sh -c 'curl -k "https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT_HTTPS/api/v1/namespaces/testing/services/nginx:80/proxy/" --header "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"'

Here are threads that I opened on SE and Calico forums that ultimately led to this solution:

The .0 concern in my initial post turned out to be a red herring. Within the Calico CIDR it's just a normal IP address; it does not necessarily have to end in .0, it just happened that Calico created one like that in my case. I've seen it create non-.0 addresses for nodes since then.

I'll copy the explanation from the other thread I linked above as to why these IP addresses are necessary:

The Kubernetes API Server runs in the host's network namespace (either as a pod, or just as a binary, depending on the distro). It isn't a regularly networked pod with its own per-pod IP address.

When a process in the host's network namespace (the API Server or any other process) connects to a pod, Calico knows it needs to encapsulate the packet in IPIP before sending it to the remote host. It chooses the tunnel address as the source so that the remote host knows to encapsulate the return packets. In IPIP mode, the underlying network doesn't know what to do with packets that have pod IP addresses on them, and might drop them. So, by encapsulating we ensure the return packets are delivered.
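If you want to see that tunnel address directly on a node, and assuming the default IPIP setup where Calico creates a tunl0 device, it shows up on the host like this:

# Run on a node: show the address of Calico's IPIP tunnel device
# (assumes IPIP mode and the default tunl0 interface name)
ip -4 addr show dev tunl0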

The IP addresses that I got from node annotations above can also be obtained by running calicoctl for specific nodes:

calicoctl get node master-01 -ojson | jq -r '.spec.bgp.ipv4IPIPTunnelAddr'
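If the proxy call can come from any node (not just the masters), the same annotation can be read for every node in one go. This is just a sketch that assumes all nodes carry the projectcalico.org/IPv4IPIPTunnelAddr annotation; each returned address would then get its own ipBlock entry, like the $MASTERn placeholders above:

# List the IPIP tunnel address of every node
# (assumes the annotation is present on all of them)
kubectl get nodes -o json | jq -r '.items[].metadata.annotations."projectcalico.org/IPv4IPIPTunnelAddr"'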

@zespri thanks for your diligence and for sharing your findings with the community!