I have a Kubernetes deployment that needs a large cache disk that can be shared between multiple pods of the same deployment. For performance reasons, the cache disk has to be ReadWriteOnce (RWO) and can only be attached to one node at a time.
I want to be able to schedule pods freely on multiple nodes and all pods on the same node should share the same cache volume. So in essence, my ideal deployment would attach one PV per node and have each pod of my deployment share the same PV (if they happen to be running on the same node).
I cannot find any way to achieve this with the Kubernetes workload controllers (ReplicaSet, Deployment, StatefulSet, DaemonSet).
A Deployment or ReplicaSet uses one pod template for all replicas, so every pod references the same PersistentVolumeClaim (which binds a single PersistentVolume that can only be attached to one node at a time). This means the deployment can never scale beyond a single node.
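For reference, a minimal sketch of the Deployment shape I mean (names, image, and mount path are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache-app          # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cache-app
  template:
    metadata:
      labels:
        app: cache-app
    spec:
      containers:
        - name: app
          image: my-app:latest        # placeholder image
          volumeMounts:
            - name: cache
              mountPath: /var/cache/app
      volumes:
        - name: cache
          persistentVolumeClaim:
            claimName: cache-pvc      # ONE RWO claim shared by ALL replicas
```

As soon as the scheduler places a replica on a second node, the RWO volume cannot attach there, so scaling stops at one node.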
A StatefulSet will not work either, since its volumeClaimTemplates give every single pod a unique PVC (and thus a unique PV). The number of PVs in use therefore equals the number of pods scheduled. This is not what I want, because I want pods on the same node to share a PV (and thus improve each other's cache hit rates).
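To illustrate the per-pod claim behavior, a sketch of the StatefulSet variant (names and sizes are placeholders):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cache-app          # placeholder name
spec:
  serviceName: cache-app
  replicas: 3
  selector:
    matchLabels:
      app: cache-app
  template:
    metadata:
      labels:
        app: cache-app
    spec:
      containers:
        - name: app
          image: my-app:latest        # placeholder image
          volumeMounts:
            - name: cache
              mountPath: /var/cache/app
  volumeClaimTemplates:
    - metadata:
        name: cache     # expands to cache-cache-app-0, cache-cache-app-1, ...
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi            # placeholder size
```

Every replica gets its own claim from the template, so two pods co-located on the same node still mount two separate PVs instead of sharing one cache.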
A DaemonSet cannot work either, since it also references a single PVC (with a single PV) that can only be attached to one node at a time. Only the pod on that node starts successfully; the pods on all remaining nodes fail because of the RWO nature of the storage.
The best ideas I have so far are to either run one Deployment per node (and handle node scaling manually) or to use hostPath mounts and mount the cache onto each node's host path by hand. Neither solution is great, for obvious reasons.
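The hostPath workaround would look roughly like this in the pod template (the path is a placeholder, and the directory would have to be provisioned on every node out of band):

```yaml
# In the pod template's spec:
volumes:
  - name: cache
    hostPath:
      path: /mnt/cache     # placeholder; must exist and be mounted on every node
      type: Directory
```

This does give all pods on a node a shared cache directory, but it bypasses the PV lifecycle entirely and ties the cluster to manual per-node disk management.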
Is there anything I am overlooking about Kubernetes storage architecture?