I’m trying to understand CSI architecture, using kubernetes-csi/csi-driver-host-path as a reference. In particular, I’m looking to see what would be required to implement something like asteven/local-zfs-provisioner with CSI (it currently uses external volume provisioning).
My question centres around how the daemons on each node would best communicate with CSI.
I can see that in the csi-driver-host-path deployment there are separate StatefulSets for csi-hostpath-provisioner, csi-hostpath-resizer etc., and that these pods have an affinity to run where the csi-hostpath-plugin is running. I can also see they communicate over a Unix domain socket stored in a hostPath volume. The socket to use is passed via the -csi-address flag to kubernetes-csi/external-provisioner etc., which in turn uses connection.Connect from kubernetes-csi/csi-lib-utils, which supports the various URL schemes described in grpc/grpc/blob/master/doc/naming.md.
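To make my mental model explicit, here is a minimal sketch (my own, not taken from the repo) of what I understand the sidecars to be doing with the -csi-address value: dialling the Unix socket and speaking CSI gRPC over it. The socket path is just an example.

```go
// Sketch only: dial the plugin's Unix socket (as the sidecars do with
// -csi-address) and make a CSI Identity call. The path is an example.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// grpc-go understands the "unix" scheme from naming.md, so a socket in a
	// hostPath (or emptyDir) volume can be addressed directly.
	conn, err := grpc.Dial("unix:///csi/csi.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("connect to CSI plugin: %v", err)
	}
	defer conn.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// The same kind of call the sidecars make on startup to discover the driver.
	info, err := csi.NewIdentityClient(conn).GetPluginInfo(ctx, &csi.GetPluginInfoRequest{})
	if err != nil {
		log.Fatalf("GetPluginInfo: %v", err)
	}
	fmt.Printf("driver %s, version %s\n", info.GetName(), info.GetVendorVersion())
}
```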
(Aside: why use a hostPath volume for the socket, rather than putting all these components in a single pod and having them communicate via an emptyDir? This isn’t really important, though.)
My main question is this: how would you go about changing this setup so that it could provision volumes on multiple nodes?
Clearly, the lowest-level hostpath-plugin can run as a DaemonSet across all the nodes, but I don’t know what the recommended way of deploying the CSI components would be.
One approach would be to replicate the CSI components on every node too: every node gets a csi-hostpath-provisioner, a csi-hostpath-resizer, etc. They could all easily communicate with the hostpath-plugin via the Unix domain socket on that node. However, they would all be watching the same PVCs in the API, so they would have to race against each other to decide which one picks up a particular PVC. That doesn’t seem right.
The other way would be to have a single, cluster-wide instance of the CSI components. This seems to make more sense. But then, how would those best communicate with the hostpath-plugin on each node? Does the k8s API provide some channel for this? Should the hostpath-plugin on each node expose its gRPC endpoint as a “service”? If so, is it responsible for securing/authenticating connections over that service? Can RBAC be used to lock down access to these services?
Since the existing containers like kubernetes-csi/external-provisioner can only talk to a single fixed endpoint, it seems to me that in any case there would need to be some “middleware” container that knows where all the DaemonSet pods/containers are and can talk to them for provisioning (a rough sketch of what I mean is below). It could, I guess, even just ‘exec’ commands directly inside them. But what’s the standard way of doing this?
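To show the kind of middleware I have in mind, here is a rough sketch with entirely made-up addresses and a hypothetical node→endpoint map: a single controller-side CSI Controller service that the stock sidecars could dial, which forwards each call to the plugin on the relevant node. It assumes a recent version of the csi spec package (which provides UnimplementedControllerServer); it is not how any existing driver does it, just an illustration of the idea.

```go
// Sketch of the "middleware" idea: a controller-side gRPC server that the
// stock sidecars would talk to, forwarding calls to a per-node plugin.
// Addresses, the node->endpoint map, and the "node" parameter are made up.
package main

import (
	"context"
	"fmt"
	"log"
	"net"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// Hypothetical mapping from node name to that node's plugin endpoint; in
// reality this would have to be discovered (e.g. by listing the DaemonSet
// pods via the API) and the connections secured somehow.
var nodeEndpoints = map[string]string{
	"node-a": "10.0.0.1:9000",
	"node-b": "10.0.0.2:9000",
}

type proxy struct {
	csi.UnimplementedControllerServer
}

func (p *proxy) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	// Imagine the target node arriving as a StorageClass parameter.
	node := req.GetParameters()["node"]
	addr, ok := nodeEndpoints[node]
	if !ok {
		return nil, fmt.Errorf("no plugin endpoint known for node %q", node)
	}
	conn, err := grpc.Dial(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	// Forward the call verbatim to the plugin on that node.
	return csi.NewControllerClient(conn).CreateVolume(ctx, req)
}

func main() {
	// The external-provisioner's -csi-address would point at this socket.
	lis, err := net.Listen("unix", "/csi/middleware.sock")
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer()
	csi.RegisterControllerServer(srv, &proxy{})
	log.Fatal(srv.Serve(lis))
}
```

That is the shape of thing I mean, but it feels like I would be reinventing something that surely already exists.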
I apologise if the answer is obvious to someone with a better overview of k8s architecture than me!
Thanks in advance,
Brian.