Docker registry not accessible from inside the cluster

I am using an EKS cluster on AWS.
I have created a Docker registry as a Deployment, then created a Service and an Ingress over it.
In the Ingress, I have placed TLS secrets for the ingress host:

spec:
  rules:
  - host: xxxxxxxxx.com
    http:
      paths:
      - backend:
          serviceName: docker-registry
          servicePort: 5000
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:

I have 4 worker nodes and a jump server.
The issue I am facing is that I am able to access the docker registry at the ingress address from the jump host, but from the worker nodes it fails with this error:

request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

What am I doing wrong here?
I have tried placing the service IP and the registry ingress host in /etc/hosts, and copying the certs to /etc/docker/certs.d/registryname.
Any hint would be great.
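
For reference, dockerd looks for a per-registry CA at `/etc/docker/certs.d/<registry-host[:port]>/ca.crt`, and the directory name must match exactly what you pass to `docker login`. A minimal sketch of that layout — `registry.example.com` is a placeholder for your registry host, and the cert here is a throwaway self-signed one just so the sketch is runnable:

```shell
# Sketch only: on a real node CERT_ROOT is /etc/docker/certs.d and ca.crt is the
# CA that actually signed the registry's serving certificate.
REGISTRY_HOST="registry.example.com"   # must match the name used in `docker login` (incl. port, if any)
CERT_ROOT="./certs.d"                  # stand-in for /etc/docker/certs.d so this runs unprivileged

# Throwaway self-signed cert standing in for the real CA.
openssl req -x509 -nodes -newkey rsa:2048 -days 1 \
  -subj "/CN=${REGISTRY_HOST}" -keyout /tmp/ca.key -out /tmp/ca.crt 2>/dev/null

mkdir -p "${CERT_ROOT}/${REGISTRY_HOST}"
cp /tmp/ca.crt "${CERT_ROOT}/${REGISTRY_HOST}/ca.crt"
```

dockerd reads certs.d per connection, so no restart is needed for docker itself; containerd as the CRI is a different story (see the containerd discussion below).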

Cluster information:

kubectl version output:
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-eks-6b7464", GitCommit:"6b746440c04cb81db4426842b4ae65c3f7035e53", GitTreeState:"clean", BuildDate:"2021-03-19T19:35:50Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.8-eks-96780e", GitCommit:"96780e1b30acbf0a52c38b6030d7853e575bcdf3", GitTreeState:"clean", BuildDate:"2021-03-10T21:32:29Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

Cloud being used: AWS
Installation method: EKS
Host OS: amazon linux ami arm64
CNI and version: Not known
CRI and version: Not known

I ran into this myself recently. Does EKS use containerd? From what I know, containerd doesn't respect newly added CA certs until it's restarted, and won't pick them up automatically until version 1.5 propagates out.

Edit: when you see that error, you need to dig further into the container engine logs to confirm it's the cert.

I checked one worker node to find the CRI; the kubelet process is as below, so I think the CRI is Docker:
/usr/bin/kubelet --cloud-provider aws --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime docker
But I did see both dockerd and containerd processes running on the worker node.

Also, on checking the docker service logs, I got the same error:

Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-06-14 08:31:57 UTC; 4 days ago
Docs: https://docs.docker.com
Process: 12574 ExecStartPre=/usr/libexec/docker/docker-setup-runtimes.sh (code=exited, status=0/SUCCESS)
Process: 12571 ExecStartPre=/bin/mkdir -p /run/docker (code=exited, status=0/SUCCESS)
Main PID: 12579 (dockerd)
Tasks: 23
Memory: 116.5M
CGroup: /system.slice/docker.service
└─12579 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Jun 19 02:23:45 ip-xxxxx dockerd[12579]: time="2021-06-19T02:23:45.876987774Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get https://xxxx: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"

After thinking about this a bit and the experiences I went through, there are a few areas you can focus on:

  • Does the registry work as it's deployed?
  • Is the problem with the certificate being installed on all the nodes that pull from the registry?
  • Can the nodes actually reach the registry?

You could do testing locally with kubectl port-forward. If you're using Docker for Desktop, you will need the --address flag so the forward binds to a network interface the Docker VM can reach for push/pull. It's also easiest to just configure the interface as an insecure registry in the Docker settings.
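
The insecure-registry setting lives in dockerd's daemon.json (on Docker for Desktop it's under Settings → Docker Engine); the address below is a placeholder and has to match exactly what you push/pull against. Docker needs a restart to pick it up:

```json
{
  "insecure-registries": ["myregistry.local:5000"]
}
```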

On the nodes, you can curl /v2/ and /v2/_catalog, just to confirm that your registry image is working. If you have to add -k or --insecure to curl, I would work on the assumption that the TLS isn't installed correctly yet.

Also, here's my registry deployment yaml. It's a bit custom: you'll see that I maintain a CA cert/key pair on my nodes and generate TLS certificates ad hoc in an init container. The real takeaway is probably how I have things mounted.

---
apiVersion: v1
kind: Service
metadata:
  name: registry
spec:
  selector:
    app: registry
  ports:
    - protocol: TCP
      port: 443
      targetPort: 5000
---
# registry.ci.svc.cluster.local
apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
  labels:
    app: registry
spec:
  replicas: 1
  selector:
    matchLabels:
      app: registry
  template:
    metadata:
      labels:
        app: registry
    spec:
      volumes:
        - name: registry-vol
          hostPath:
            path: /var/lib/data/registry
            type: DirectoryOrCreate
        - name: cluster-shared-ca-vol
          hostPath:
            path: /etc/ssl/k8s
            type: Directory
        - name: cert-vol
          emptyDir: {}

      initContainers:

        # Utility that generates ssl certificate
        - name: generate-ssl-certificate
          image: alpine:latest
          imagePullPolicy: Always
          env:
            - name: K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          workingDir: /certs/
          command:
            - /bin/sh
            - -c
            - |
              set -x -e
              date -u
              apk add openssl
              openssl req -subj '/C=K8/ST=Cluster/L=Pod/O=SnakeOil/OU=SnakeOil/CN=registry.'"${K8S_NAMESPACE}"'.svc.cluster.local' -nodes -newkey rsa:4096 -keyout /certs/registry.key -out snakeoil.csr
              openssl x509 -req -in snakeoil.csr -CA /etc/ssl/k8s/cluster-shared-ca.crt -CAkey /etc/ssl/k8s/cluster-shared-ca.key -CAcreateserial -out /certs/registry.crt -days 365 -sha256
          volumeMounts:
            - name: cert-vol
              mountPath: /certs/
            - name: cluster-shared-ca-vol
              mountPath: /etc/ssl/k8s

      containers:
        - image: registry:2
          name: registry
          imagePullPolicy: IfNotPresent
          env: # Ref: https://docs.docker.com/registry/configuration/
            - name:  REGISTRY_HTTP_ADDR
              value: 0.0.0.0:5000
            - name:  REGISTRY_HTTP_SECRET
              value: SNAIK-OIL-SECRET
            - name: REGISTRY_HTTP_TLS_CERTIFICATE
              value: "/certs/registry.crt"
            - name: REGISTRY_HTTP_TLS_KEY
              value: "/certs/registry.key"
            - name:  REGISTRY_LOG_LEVEL
              value: debug


          ports:
            - containerPort: 5000

          volumeMounts:
            - name: registry-vol
              mountPath: /var/lib/registry
            - name: cert-vol
              mountPath: /certs/

Will try this certs part; I have installed the certs at the ingress level.
I think it's some issue with SG or NACL, because the registry is accessible through the jump host, which is in the same subnet as the worker nodes.
It's only from within the k8s cluster (pods and workers) that the registry is not accessible.

My deployment definition is like this:

> apiVersion: apps/v1
> kind: Deployment
> metadata:
>   name: registry-deployment
>   namespace: devops
>   labels:
>     app: registry
> spec:
>   replicas: 1
>   selector:
>     matchLabels:
>       app: registry
>   template:
>     metadata:
>       labels:
>         app: registry
>     spec:
>       containers:
>       - name: registry
>         image: registry:2.6.2
>         volumeMounts:
>         - name: repo-vol
>           mountPath: "/var/lib/registry"
>         - name: certs-vol
>           mountPath: "/certs"
>           readOnly: true
>         - name: auth-vol
>           mountPath: "/auth"
>           readOnly: true
>         env:
>         - name: REGISTRY_AUTH
>           value: "htpasswd"
>         - name: REGISTRY_AUTH_HTPASSWD_REALM
>           value: "Registry Realm"
>         - name: REGISTRY_AUTH_HTPASSWD_PATH
>           value: "/auth/htpasswd"
>       volumes:
>       - name: repo-vol
>         persistentVolumeClaim:
>           claimName: docker-repo-pvc
>       - name: certs-vol
>         secret:
>           secretName: certs-secret
>       - name: auth-vol
>         secret:
>           secretName: auth-secret
SVC
> apiVersion: v1
> kind: Service
> metadata:
>   name: docker-registry
>   namespace: devops
> spec:
>   selector:
>     app: registry
>   ports:
>   - port: 5000
>     targetPort: 5000

INGRESS
> apiVersion: networking.k8s.io/v1beta1
> kind: Ingress
> metadata:
>   name: registry-ingress
>   namespace: devops
>   annotations:
>     nginx.ingress.kubernetes.io/rewrite-target: /
>     kubernetes.io/ingress.class: nginx
> spec:
>   tls:
>   - hosts:
>     - example.com
>     secretName: tls-registry
>   rules:
>   - host: example.com
>     http:
>       paths:
>       - path: /
>         backend:
>           serviceName: docker-registry
>           servicePort: 5000
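
As an aside, since the server is on 1.19, the networking.k8s.io/v1 Ingress API is available (v1beta1 is deprecated), and the dotted group/annotation names matter. A sketch of the equivalent v1 manifest — host, namespace, and secret names taken from the manifest above, nginx ingress class assumed, and pathType is required in v1:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: registry-ingress
  namespace: devops
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    kubernetes.io/ingress.class: nginx
spec:
  tls:
  - hosts:
    - example.com
    secretName: tls-registry
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: docker-registry
            port:
              number: 5000
```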

I have created the certs for example.com and placed them in both the deployment secret and the secret used in the ingress; I also tried with the same secretName in both.
I suspect that if it were a TLS issue then docker login should not work from the jump server either, but here it is working from the jump server and not from the worker nodes.
Or maybe it has something to do with DNS routing? The ingress host is not accessible from inside the worker nodes.
I'll give the way you are doing it a try.

I tried with port-forward and it works on the worker node,
but with it I cannot access the registry from the jump server via workerip:forwardedport.

docker login 127.0.0.1:49999
Username: myuser
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See

Login Succeeded

Still can't figure out why I can't use the ingress.