I’m looking to port an application that currently runs via Docker Compose to Kubernetes.
The main container takes some input, launches a number of sibling Docker containers to do the work (depending on the input), then cleans them up and returns the result.
It uses the Docker Engine API via the Go Docker SDK, talking directly to the Docker daemon by mounting /var/run/docker.sock into the container.
I’m wondering what approaches I can take to port this application to Kubernetes. It basically needs the application running in a pod to launch and manage the lifecycle of additional containers within the pod itself.
Running a Docker daemon inside a pod (Docker-in-Docker) is one approach I considered, but I’m not sure if it’s a good idea, or whether it will cause problems (performance or otherwise).
If it helps, I could probably port the orchestration code to use containerd, CRI-O, or whichever container runtime Kubernetes uses, as long as the application running inside the pod can still manage the lifecycle of sibling containers through it.
Probably not without a fair bit of work, unfortunately. All communication between containers is currently done via mounted volumes. The root container roughly does the following:

- looks at the HTTP/JSON request it receives
- creates a temporary directory for the request
- writes the input to process into the temp dir
- launches 1+ sibling containers with the temporary directory mounted
- sibling containers process the data in the mounted dir and write their output there
- root container gathers the results from the temp dir and returns them
- root container cleans up the temp dir and ensures all sibling containers have been terminated/removed
The data shared between the root/sibling containers is potentially quite large, so having all the sibling containers run on the same host/filesystem as the root is important for performance.
Maybe it’s still possible with Kubernetes, but I’m not sure whether there’s a sensible way of doing it, or whether it’s just not a good fit.
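For what it’s worth, the closest Kubernetes equivalent of “same host, shared filesystem” is an emptyDir volume: all containers in a pod are always scheduled onto the same node, and an emptyDir gives them node-local scratch space. A rough sketch, with placeholder names and images (the caveat being that containers in a pod spec are fixed at creation time, so dynamically launched siblings would instead need pods or Jobs created via the API):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: root-with-siblings      # placeholder name
spec:
  volumes:
    - name: shared
      emptyDir: {}              # node-local scratch dir shared by all containers in the pod
  containers:
    - name: root
      image: root-app:latest    # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: sibling
      image: worker:latest      # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /data
```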
Depending on “how” the “main container” takes its input, you could use Kubernetes Jobs to run the “processing containers”; parameters and/or input can be passed to the Job’s pod via environment variables, ConfigMaps, or volumes, and the results collected (depending on what they “output”) the same way…
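To illustrate the Job idea, a per-request Job could look roughly like this — all names, the env var, and the PVC are placeholders, and in practice the main container would create these objects through the Kubernetes API rather than kubectl:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: process-          # one Job per incoming request
spec:
  ttlSecondsAfterFinished: 300    # auto-clean finished Jobs
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: worker:latest    # placeholder image
          env:
            - name: REQUEST_ID    # parameter passed via environment variable
              value: "123"
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: request-data   # placeholder PVC holding input/output
```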
It may be a “quick and dirty” solution… but exploring Knative or OpenFaaS might give you more control.
Those both look like good solutions, though it seems they’d require a significant change to the way the app works.
I’m ideally aiming to avoid that - it’s quite a large chunk of complex code that interacts with the Docker API heavily and has been stable in production for a long time, so we’d be reluctant to rewrite anything significant (switching from the Docker API to containerd or a similar API is probably as far as we’d go).
The ideal solution would keep as much of the application running unchanged as possible, but if Kubernetes isn’t really a good fit for it as is, that’s no problem either - it’s not like we’ve hit any issues running it under Docker anyway.
From a different perspective - if your service is exposed to the internet and an attacker were to somehow gain elevated permissions, they could interact with all the containers running on that node (whether you’re on k8s or just a host with Docker on it). Mounting /var/run/docker.sock is effectively root on the host, so that socket is worth treating as highly privileged.