Volume: is there a way to mount a read-only volume from a repo image?

I’m working on a small project that would contain a simple web server hosting some static files, such as pictures, for a parent project/application.

Currently, I bundle the picture files and the web server into a single container image.

However, I realized that the picture files need to change for each parent project, or even for a new revision of the same project, while the web server itself is very simple and would remain unchanged. Therefore, I believe it would be better to split this into a container that hosts only the web server and a separate entity that provides the static files.

ConfigMap seemed like a good option at first. However, ConfigMap is not friendly to binary data, and it has a size restriction (I’m using Helm, and I’m not sure whether the restriction comes from Kubernetes or from Helm). Even without the size limit, all config data is encoded and embedded in the YAML file, so passing a large amount of data through a ConfigMap would not be a good idea.

I looked at the volume types. Cloud storage isn’t an option for me. I don’t like the idea of using local or hostPath volumes because, to my understanding, setting them up is outside Kubernetes’ scope. For a similar reason, the remote filesystem options would be better implemented in a container, and at that point it’s even more complicated than just adding the picture files to the web server container.

Since I’m still new to Kubernetes, I don’t know if I’ve browsed all possible options. How would you implement this small project?

I think the best solution would be a new feature that lets Kubernetes pull an image containing the picture files and mount it as a volume. The image would be stored in a repo, just like a container image. Do you agree? I look forward to your comments. Thanks!

https://github.com/kubernetes/git-sync

Yes! This seems a promising solution that meets my needs. I’ll give it a try.

On the other hand, is there a particular reason for making it a sidecar rather than a built-in feature of Kubernetes? It appears to me that the sidecar solution is a little more flexible, at the cost of the sidecar’s overhead and added system complexity, even though that cost is insignificant in most cases.

Anyway, thanks a lot for the info. I’m going to try it out now :slight_smile:

The number of tools needed to do something like git clone is non-trivial. The git-sync base image includes git and ssh. git requires a shell. SSH has lots of dependencies. We don’t want to package all of that into Kubernetes when only a small number of users need it.

This is exactly what sidecars are good for :slight_smile:

I was too excited to realize the “git” in the project name truly means git-only :frowning:

Although I have tested it and it works for my project, my initial intention is to “sync” a docker/OCI repo rather than a git repo (so I don’t have to set up a git repo in addition to the docker/OCI repo). Are you aware of any similar projects that pull from a Docker image registry instead?

I was hoping Kubernetes may have such a feature to pull an image file from the docker/oci registry and directly mount the image to a volume instead of deploying the image. As said, this would be similar to ConfigMap, but would support large binary data from a separate repo image. Would you consider this as a good feature request?

my initial intention is to “sync” a docker/OCI repo rather than a git repo

Are you aware of any similar projects that pull from a Docker image registry instead?

I am not aware of any tool that literally does this, but that doesn’t mean one doesn’t exist. At the end of the day, an OCI image is a set of tar files, right? So it should be possible to build something like this.

I was hoping Kubernetes may have such a feature to pull an image file from the docker/oci registry and directly mount the image to a volume

Nope. We talked about this a LONG time back, but we do not have it, and I would be against baking it into Kubernetes. This is what a sidecar is good for.

Would you consider this as a good feature request?

I would not put this into git-sync. It’s only superficially similar, but almost all of the implementation is different. If this is something you want to tackle, I’d be happy to see you fork git-sync (oci-sync?) and make a new tool. There are a few things to consider:

  1. git-sync depends on exec() to git to do most of the heavy lifting. If there’s a CLI tool to fetch and unpack an OCI image into a directory, it will be easy to adapt. If not, somewhat harder. I found this, which looks promising: https://github.com/opencontainers/image-tools/blob/master/man/oci-image-tool-unpack.1.md

  2. Just like with git, I expect auth to be the hardest part of the problem.

  3. I encourage you to retain the atomic-update semantics that I baked into git-sync, even though they are a little complex.
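For readers wondering what "atomic-update semantics" in point 3 can look like, here is a minimal Python sketch of the general symlink-swap technique: each new version is unpacked into its own directory, and a symlink is then renamed over the old one so consumers never see a half-written tree. This is only an illustration under assumptions, not git-sync's actual code; the function name `publish_atomically`, the `populate` callback, and the `rev-` directory prefix are all hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def publish_atomically(root: Path, link_name: str, populate) -> Path:
    """Fill a fresh directory under `root`, then repoint root/link_name at it.

    `populate` is a hypothetical callback that writes the new content into
    the directory it is given (for example, by unpacking an image there).
    """
    root.mkdir(parents=True, exist_ok=True)
    link = root / link_name

    # Remember the version currently being served so it can be removed later.
    old_target = (root / os.readlink(link)) if link.is_symlink() else None

    # 1. Build the new version in its own directory. Nothing points at it
    #    yet, so readers cannot observe it half-written.
    new_dir = Path(tempfile.mkdtemp(prefix="rev-", dir=root))
    os.chmod(new_dir, 0o755)  # mkdtemp defaults to 0700; let other users read
    populate(new_dir)

    # 2. Create a temporary symlink, then rename it over the real one.
    #    On POSIX, rename() replaces the destination in a single step.
    tmp_link = root / (link_name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(new_dir.name, tmp_link)  # relative target survives remounting
    os.replace(tmp_link, link)

    # 3. Only now is it safe to drop the previous version.
    if old_target is not None:
        shutil.rmtree(old_target, ignore_errors=True)
    return new_dir
```

The web server would then serve `root/link_name` (say, `/data/current`) rather than any of the versioned directories. A request that already opened a file in the old directory can still fail once step 3 removes it, so a real tool might delay the cleanup.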

I was considering having Kubernetes provide such a feature because Kubernetes already has everything (e.g. authentication, pull policy, etc.) needed to download OCI images, so there is no need to repeat in a sidecar what Kubernetes is already capable of. Certainly, I also agree that this (supporting large binary configuration data) is not an essential feature.

I played with a Nexus OCI repo today and realized that an OCI image is not a simple tar file. As you mentioned, an OCI image is a set of tar files, each one representing a layer of the docker image. Luckily, a simple docker image (from scratch) can be made to have only one layer, which makes the task much easier.
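To make the "set of tar files" point concrete, here is a rough Python sketch of turning already-downloaded layer tarballs into a plain directory by applying them in order, lowest layer first. It assumes the layers are local files listed in manifest order; the function name `apply_layers` is hypothetical. It handles the simple `.wh.<name>` whiteout markers that layers use to record deletions, but deliberately skips opaque whiteouts and link safety checks, so treat it as an illustration rather than a complete unpacker.

```python
import os
import shutil
import tarfile
from pathlib import Path

WHITEOUT_PREFIX = ".wh."


def apply_layers(layer_tars, dest: Path) -> None:
    """Unpack layer tarballs into `dest`, lowest layer first."""
    dest.mkdir(parents=True, exist_ok=True)
    dest_real = dest.resolve()

    for layer in layer_tars:
        # Mode "r" (the default) auto-detects gzip-compressed layers.
        with tarfile.open(layer) as tar:
            for member in tar:
                target = (dest_real / member.name).resolve()
                # Refuse entries whose path would land outside dest.
                if target != dest_real and dest_real not in target.parents:
                    raise ValueError(f"unsafe path in layer: {member.name}")

                name = os.path.basename(member.name)
                if name.startswith(WHITEOUT_PREFIX):
                    # A whiteout marks an entry from a lower layer as deleted.
                    # (Opaque whiteouts, ".wh..wh..opq", are not handled here.)
                    victim = target.parent / name[len(WHITEOUT_PREFIX):]
                    if victim.is_dir() and not victim.is_symlink():
                        shutil.rmtree(victim)
                    elif victim.exists() or victim.is_symlink():
                        victim.unlink()
                    continue

                # Later layers simply overwrite files from earlier ones.
                tar.extract(member, dest_real)
```

For a single-layer image built from scratch, this collapses to extracting one tarball, which is why that case is so much easier.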

For now, Nexus is used to host the Helm chart repo as well as the OCI repo. A benefit of Nexus is that it supports different types of repos. After OCI, I also considered rpm packages for the static files. However, since most base docker images don’t have rpm utilities built in, it’s not a good option in my opinion.

Another option with Nexus is the “raw” repo, which is pretty much like an http file server.

So, for now, I have three options: git-sync to retrieve the files from a git repo, curl to retrieve the OCI image from a Nexus OCI repo, or curl to retrieve the tarball from a Nexus raw repo.
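The third option is the simplest to picture. As a rough sketch, here is the Python equivalent of "curl the tarball, then untar it", which could run once at startup to fill a volume shared with the web server. The function name `fetch_and_unpack` is hypothetical, and the sketch assumes the archive comes from a repo you control and needs no authentication.

```python
import shutil
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def fetch_and_unpack(url: str, dest: Path) -> None:
    """Download a (optionally gzip-compressed) tarball and extract it into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(suffix=".tar") as tmp:
        # Stream the download to disk instead of holding it all in memory.
        with urllib.request.urlopen(url, timeout=30) as resp:
            shutil.copyfileobj(resp, tmp)
        tmp.flush()
        tmp.seek(0)
        # The archive is assumed to be trusted (it comes from your own repo).
        with tarfile.open(fileobj=tmp) as tar:
            tar.extractall(dest)
```

It would be called with something like `fetch_and_unpack("https://nexus.example.com/repository/static-raw/pictures.tar.gz", Path("/data"))`, where both the URL and the target path are made-up placeholders.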

Creating oci-sync to support a generic OCI repo/image would be ideal, and I believe the majority of the code could be extracted from Kubernetes, which would be the best approach. However, I’m not sure when I would be able to actually work on this.