Help me understand the big picture around not wanting containers running inside of containers

This is a question for the designers of k8s in particular.

However you may think of containers and use them, my coworkers and I have always used them to solve “dependency hell” with native libraries, CUDA versions, incompatible Python modules, and so on: something that would be a huge pain to resolve within a single system, but is easy if each program gets its own container. We have a lot of bespoke CLI tools that call each other like any other CLI program might (sharing files on the same filesystem via volume mounts!). We have just taken for granted that you can call docker run just as you’d exec a program from a shell.
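For concreteness, here is a minimal sketch of that pattern in Python. The image name, tool arguments, and paths are made up for illustration; the only real pieces are the docker run flags (--rm and -v for a bind mount).

```python
import shlex


def docker_run_argv(image, tool_args, host_dir, mount_point="/data"):
    """Build the argv for running a containerized CLI tool on a shared directory."""
    return [
        "docker", "run", "--rm",
        # Bind-mount the shared host directory so caller and callee see the same files.
        "-v", f"{host_dir}:{mount_point}",
        image,
        *tool_args,
    ]


# Hypothetical image and file names, purely for illustration.
argv = docker_run_argv(
    "example/resize-tool:latest",
    ["--input", "/data/in.png", "--output", "/data/out.png"],
    host_dir="/mnt/shared",
)

# Show the equivalent shell command without launching anything.
print(shlex.join(argv))

# To actually run it (requires access to a Docker daemon):
#   import subprocess
#   subprocess.run(argv, check=True)
```

The caller treats the container like any other subprocess, which is exactly what stops working when the calling pod has no container runtime socket to talk to.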

Fast-forward to the post-1.24 world, where this is no longer possible, or at least not in the way it was before.

Now, I understand the security concerns, I really do. Most of these pods were Jobs that were not exposed outside the cluster, so I’m skeptical of the threat, but let’s just say that I get it.

But why was there no attempt to solve the security concern and normalize rootless containers, instead of writing the whole thing off as an anti-pattern? I think this shows a real lack of imagination. Just because this was abused by build systems and such doesn’t mean there are no valid use cases. Programs working on the same volume of data is a valid use case. Having to constantly move files around the network just because we wanted to avoid dependency hell frankly feels like an arbitrary restriction compared to bare metal. I’m currently trying to de-containerize all of our internal tools because of this. Not everything is a web service; sometimes you have to perform long-running, resource-intensive image or video or whatever manipulation with diverse tools. I’m considering switching some of my pods to ECS or raw EC2 commands.

So basically, other than security concerns, why do you see “nested” containers as an anti-pattern? Are containers not just very high-level “functions”? Don’t you compose functions?

I don’t think nested containers are an anti-pattern, but the amount of privilege needed to achieve this pattern is significant.

In fact, it was almost exactly 10 years ago that we released LMCTFY which does sort of support nested containers, which is a pattern we use inside Google. Note that the UX is wildly different than docker pull and docker run.

Delegating cgroups wasn’t well supported in the kernel, but I still think it’s an interesting way to operate: an “administrative” container (managed by k8s), and then freedom to work within those bounds on your own.

Thanks for the background info, and thanks for all your work! I’m mostly just annoyed that we went whole hog on this pattern because we didn’t have a real devops person to play Cassandra, but I swear, trying to find help on the subject online is like trying to find help with evil mode from Richard Stallman.

Anything that gets done in OSS needs a champion. This particular topic doesn’t come up much, in my experience, and so it hasn’t really had a champion.