Multiple Replicas of a Pod with a Large Dataset

We are running in AWS EKS and have one pod that needs a large dataset. I create an AWS EBS volume containing the dataset and mount it read-only in the container. I also have an AWS EBS snapshot of the dataset so that I can easily make more volumes.
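For reference, this is roughly how a new volume can be restored from the snapshot with the AWS CLI; the snapshot ID here is a placeholder, and the zone must be the one the pod will be scheduled into:

```shell
# Create a new EBS volume from the dataset snapshot (snapshot ID is a placeholder).
# The volume must live in the same availability zone as the node that mounts it.
aws ec2 create-volume \
  --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-2a \
  --volume-type gp3 \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=templates}]'
```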

I was not able to find any way for Kubernetes to create the volumes for me, except for some alpha code that I don't want to deploy to production, so I ended up coding my deployment like this:

  replicas: 1
  template:
    metadata:
      labels:
        role: compute
    spec:
      # zone must match the zone of the volume, below us-east-2a
      nodeSelector:
        topology.kubernetes.io/zone: us-east-2a
      volumes:
        - name: templates
          awsElasticBlockStore:
            volumeID: vol-059350ca7d1231c8c
            fsType: ext4
      containers:
        - name: classifier
          image: "{{ manifest.classifier }}"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /home/whatever/templates
              name: templates
              readOnly: true

That works OK, but I can't scale it because the volume ID is hardcoded.

I am looking for suggestions for a way to make this scale (other than having Ansible create more deployments, each with replicas: 1 and a hardcoded volume ID).

Why don’t you put the data in S3?

You can create a PersistentVolume with the ReadOnlyMany access mode and use a PersistentVolumeClaim to mount it from your deployment.
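A minimal sketch of that approach, assuming a storage backend that actually supports ReadOnlyMany (the NFS server address, path, names, and sizes below are all illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: templates-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  # Backend-specific section; shown here for a hypothetical NFS export.
  nfs:
    server: fileserver.internal
    path: /exports/templates
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: templates-pvc
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
```

The deployment would then reference the claim with a `persistentVolumeClaim` volume (`claimName: templates-pvc`), and every replica can mount it read-only.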


Unfortunately, AWS EBS volumes don't support ReadOnlyMany; they only support ReadWriteOnce.