Ingress address individual pods in stateful set

Cluster information:

Kubernetes version: 1.21
Cloud being used: AWS EKS
Installation method: AWS EKS
Host OS: Amazon Linux 2
CNI and version: AWS VPC CNI (latest with EKS 1.21)
CRI and version: Whatever is supplied with AWS EKS 1.21 on their latest Amazon Linux 2 AMI

Greetings all.

I have a pod that I want to autoscale with the HPA, but I would also like to be able to address each pod explicitly and individually via an ingress (I’m using the Nginx Ingress Controller to terminate TLS and provide some other sugar).

It has been suggested that a StatefulSet is the most appropriate choice here (rather than a Deployment) because I can still use the HPA and each pod is named with an incrementing ordinal. I have a working solution in which I create a Service for each pod ordinal in the StatefulSet, using the statefulset.kubernetes.io/pod-name label in the Service selector, and then define a host rule in my Ingress for each Service.
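(For reference, this per-pod selector works because the StatefulSet controller automatically stamps every pod it manages with its own name as a label; the pod name below is illustrative:)

```yaml
# Added automatically by the StatefulSet controller to each pod,
# so a per-pod Service selector can match exactly one pod.
metadata:
  labels:
    statefulset.kubernetes.io/pod-name: myapp-0
```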

This seems overly verbose to me, since the StatefulSet is already able to make StatefulSet-ORDINAL.ServiceName.Namespace resolve to the appropriate pod through a single headless service, rather than N services (where N is the maxReplicas that the HPA might scale the StatefulSet up to).
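For context, the single headless governing service that provides those per-pod DNS names looks roughly like this (names are illustrative; the StatefulSet would reference it via serviceName: myapp-headless):

```yaml
# Headless governing service for the StatefulSet (illustrative names).
# With this in place, each pod resolves in-cluster at
# myapp-<ordinal>.myapp-headless.<namespace>.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
spec:
  clusterIP: None          # headless: DNS returns the pod IPs directly
  selector:
    app: myapp
  ports:
    - name: http
      port: 80
      targetPort: http
```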

Is there some way I can route traffic from my ingress to each individual pod in the StatefulSet without having to create a Service for each one?

This is a simplified version of the Helm template that I am using to generate my current solution:

{{- $statefulSetReplicas := .Values.autoscaling.maxReplicas }}

{{- range $e, $i := until $statefulSetReplicas }}
{{- $ordinalName := printf "%s-%d" (include "foo.fullname" $) $i }}
---
apiVersion: v1
kind: Service
metadata:
  name: {{ $ordinalName }}
  labels:
    {{- include "foo.labels" $ | nindent 4 }}
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    {{- include "foo.selectorLabels" $ | nindent 4 }}
    statefulset.kubernetes.io/pod-name: {{ $ordinalName }}
{{- end }}

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "foo.fullname" . }}
  labels:
    {{- include "foo.labels" . | nindent 4 }}
  annotations:
    {{- with .Values.ingress.annotations }}
      {{- tpl (toYaml .) $ | nindent 4 }}
    {{- end }}
spec:
  ingressClassName: "nginx"
  tls:
    - secretName: {{ .Values.ingress.tls.secretName }}
      hosts:
{{- range $e, $i := until $statefulSetReplicas }}
{{- $fullyQualifiedHostname := printf "%s-%d.%s" "myapp" $i $.Values.global.domainName }}
        - {{ $fullyQualifiedHostname | quote }}
{{- end }}
  rules:
{{- range $e, $i := until $statefulSetReplicas }}
{{- $fullyQualifiedHostname := printf "%s-%d.%s" "myapp" $i $.Values.global.domainName }}
{{- $ordinalName := printf "%s-%d" (include "foo.fullname" $) $i }}
    - host: {{ $fullyQualifiedHostname | quote }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ $ordinalName }}
                port:
                  name: http
{{- end }}
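For example, with maxReplicas: 2 and foo.fullname rendering to myapp, the template above renders roughly to the following (abbreviated; labels and the second ordinal omitted):

```yaml
# Rendered per-pod Service for ordinal 0 (illustrative names).
apiVersion: v1
kind: Service
metadata:
  name: myapp-0
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
  selector:
    statefulset.kubernetes.io/pod-name: myapp-0
---
# ...plus myapp-1, and one Ingress host rule per hostname:
#   myapp-0.domain.name -> Service myapp-0
#   myapp-1.domain.name -> Service myapp-1
```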

I don’t think Services were designed with the idea of routing to an individual pod like this. Headless services exist to enable coordination of resources within the cluster, but exposing pods like this seems like a bit of an anti-pattern. Two questions that I would stop to ask before implementing this would be:

  • Why would you want to expose individual pods like this?
  • Do you need to expose pods outside of the cluster like this? (Perhaps you could give the thing accessing them intra-cluster access so that it can reach the headless service, maybe with an OpenVPN service or something.)

Hi @protosam!

Indeed, this was my initial thought as well – it doesn’t feel correct to me either.

The use case is an RTC coordinator service where clients on the Internet need to talk to the same instance if they want to share the same media stream. Because of this the pods need to be exposed to the public, and via an ingress to provide the user-friendly URL, TLS termination, authentication shim, etc.

The Ingress documentation (Ingress | Kubernetes) hints that I could possibly do something like this with a resource backend rule:

http:
  paths:
    - path: /
      pathType: Prefix
      backend:
        resource:
          apiGroup: core
          kind: Pod
          name: my-pod-name

Kubernetes does accept this as a valid configuration and querying the ingress looks sane:

  Host                                    Path  Backends
  ----                                    ----  --------
  my-pod-name-0.domain.name
                                          /   APIGroup: core, Kind: Pod, Name: my-pod-name-0
  my-pod-name-1.domain.name
                                          /   APIGroup: core, Kind: Pod, Name: my-pod-name-1
  ...
  ...

However, the Nginx Ingress Controller simply returns a 503 when I test it. I’m guessing the Nginx Ingress Controller doesn’t support resource backends? Looking at the generated configuration on the controller pod, I see that set $proxy_upstream_name is being set to something nonsensical, as if it were expecting a service backend.

Services themselves do support sticky sessions, which work based on client IP.

Have you tried out service.spec.sessionAffinity and service.spec.sessionAffinityConfig.clientIP.timeoutSeconds from the ServiceSpec?

You would set sessionAffinity to ClientIP and set timeoutSeconds to an integer between 0 and 86400 (1 day). The default is 10800 (3 hours).
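A sketch of what that would look like on a Service (the name, selector, and port are illustrative; sessionAffinity and sessionAffinityConfig are the real ServiceSpec fields):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: http
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # 3 hours (the default); maximum is 86400
```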

Actually I just realized coordination would still be an issue. So yeah, this would probably be best solved via an ingress controller.

You might be able to use the built-in rule matching and middlewares in Traefik v2 to get to headless services, but I’m not sure what that would look like.

@nicolaw , have you solved it this way?