I’m just getting back to Kubernetes after a few years’ hiatus and loving it. I’m now in the process of moving some containers from an old Docker Compose cluster to a k8s cluster, and there’s one container I’m not sure how best to move over.
The container itself manages jobs it receives off an Azure queue. Once it receives a job, it:
- Downloads a bunch of assets that it will need to process
- Spawns another container using the host’s Docker instance and an image specified by the job
- Configures the new container to share a volume on the host so they can pass files
- Once the job completes, the manager grabs the results, posts them to the server, and cleans up the container
The manager can run a number of jobs in parallel. Right now the jobs are run by the manager talking directly to the Docker instance on the host server.
One other important aspect is that the images for running the containers can be quite large - up to 1GB each (I know, gross, but third party dependencies…). Right now the manager ensures all of these images are pulled and ready to go on the Docker host so jobs start processing quickly.
What would be the best way to translate this to Kubernetes? So far my best idea is to put the manager together with a docker-in-docker container in the same pod and run jobs in that container. The downsides are that jobs can’t make use of more than one node, and that each time the docker-in-docker pod gets recreated it needs to download 10+GB of images before it’s ready to run.
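For reference, here’s roughly what I imagine that docker-in-docker pod would look like — just a sketch, with placeholder image names and the usual DinD caveats (the sidecar has to run privileged, and I’m disabling TLS so the manager can reach it over plain TCP on localhost):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: job-manager
spec:
  containers:
    - name: manager
      image: myregistry/job-manager:latest   # placeholder for the manager image
      env:
        # point the manager's Docker client at the DinD sidecar
        - name: DOCKER_HOST
          value: tcp://localhost:2375
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true                     # DinD requires a privileged container
      env:
        # empty value disables TLS, enabling the plain tcp://localhost:2375 endpoint
        - name: DOCKER_TLS_CERTDIR
          value: ""
      volumeMounts:
        - name: dind-storage
          mountPath: /var/lib/docker         # the DinD image cache lives here
  volumes:
    - name: dind-storage
      emptyDir: {}                           # wiped whenever the pod is recreated
```

Since `/var/lib/docker` is backed by an `emptyDir`, the image cache disappears whenever the pod is rescheduled — which is exactly the 10+GB re-download problem.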
Is there a better way?