I’m looking to port an application that currently runs via Docker Compose to Kubernetes.
The main container takes some input, launches a number of sibling Docker containers to do the work (depending on the input), then cleans them up and returns the result.
It uses the Docker Engine API via the Go Docker SDK, talking directly to the Docker daemon by mounting /var/run/docker.sock into the container.
I’m wondering what approaches I can take to port this application to Kubernetes. It basically needs the application running in a pod to launch and manage the lifecycle of additional containers within the pod itself.
Running a Docker daemon inside a pod (Docker-in-Docker) is one approach I considered, but I’m not sure if it’s a good idea, or whether it will cause problems (performance or otherwise).
If it helps, I could probably port the orchestration code to use containerd, CRI-O, or whichever container runtime Kubernetes uses, as long as the application running inside the pod can still manage the lifecycle of sibling containers through it.
Probably not without a fair bit of work, unfortunately. All communication between containers is currently done via mounted volumes. The root container roughly does the following:

- looks at the HTTP/JSON request it receives
- creates a temporary directory for the request
- writes the input to process into the temp dir
- launches 1+ sibling containers with the temporary directory mounted
- sibling containers process the data in the mounted dir and write their output there
- root container gathers the results from the temp dir and returns them
- root container cleans up the temp dir and ensures all sibling containers have been terminated/removed
The data shared between the root/sibling containers is potentially quite large, so having all the sibling containers run on the same host/filesystem as the root is important for performance.
Maybe it’s still possible with Kubernetes, but I’m not sure whether there’s a sensible way of doing it, or whether it’s just not a good fit.
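For what it’s worth, the closest Kubernetes equivalent of “same host, shared filesystem” is an emptyDir volume: all containers in a pod are always scheduled onto the same node, and an emptyDir gives them node-local scratch space. A rough sketch, with placeholder names and images (the caveat being that containers in a pod spec are fixed at creation time, so dynamically launched siblings would instead need pods or Jobs created via the API):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: root-with-siblings      # placeholder name
spec:
  volumes:
    - name: shared
      emptyDir: {}              # node-local scratch dir shared by all containers in the pod
  containers:
    - name: root
      image: root-app:latest    # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: sibling
      image: worker:latest      # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /data
```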
Depending on “how” the “main container” takes its input, you could use Kubernetes Jobs to run the “processing containers”; parameters and/or input can be passed to the Job’s pod via environment variables, ConfigMaps, or volumes, and the results collected (depending on what they “output”) the same way…
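To illustrate the Job idea, a per-request Job could look roughly like this — all names, the env var, and the PVC are placeholders, and in practice the main container would create these objects through the Kubernetes API rather than kubectl:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: process-          # one Job per incoming request
spec:
  ttlSecondsAfterFinished: 300    # auto-clean finished Jobs
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: worker:latest    # placeholder image
          env:
            - name: REQUEST_ID    # parameter passed via environment variable
              value: "123"
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: request-data   # placeholder PVC holding input/output
```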
It may be a “quick and dirty” solution… but exploring Knative or OpenFaaS might give you more control.
Those both look like good solutions, though it seems they’d require a significant change to the way the app works.
I’m ideally aiming to avoid that - it’s quite a large chunk of complex code that interacts with the Docker API heavily and has been stable in production for a long time, so we’d be reluctant to rewrite anything significant (switching from the Docker API to containerd or a similar API is probably as far as we’d go).
The ideal solution would keep as much of the application running unchanged as possible, but if Kubernetes isn’t really a good fit for it as is, that’s no problem either - it’s not like we’ve hit any issues running it under Docker anyway.
From a different perspective - if your service is exposed to the internet and an attacker were to somehow gain elevated permissions, they could interact with all the containers running on that node (whether you’re on k8s or just a host with Docker on it). Mounting /var/run/docker.sock is effectively root on the host, so that socket is worth treating as highly privileged.