Best practices for working nimbly with a changing codebase

Hi all,

I’m new to k8s, running a live cluster, and I’m finding it cumbersome to update the code run by my containers. I’d like to find out if there are some best practices folks use to make this more seamless.

In summary, my use case is that I have a few different containers that run as part of jobs that get spawned dynamically (triggered by changes in a database, which represent new requests). The codebase is all Python. While the general characteristics of those jobs are constant (things like their tolerations, volume claims, namespace, labels, etc.), the code that is executed inside the jobs is subject to frequent changes (bug fixes, new features, etc.).

Whenever I update the code that’s supposed to run inside one of the containers, my current workflow is to rebuild the container image and push it to my registry with an incremented version number. I then have to go around and update the version number everywhere else in my system that starts jobs with this particular image, so that it matches the new one. That in turn means I need to rebuild and redeploy every container that spawns the recently updated job. Since these jobs are themselves triggered by code that lives inside other containers, one downstream change sets off a chain reaction of required upstream changes to make everything work properly.

This seems to be a common use case, so I’m wondering what the best practices are to avoid this super inconvenient workflow.

For instance, one approach would be to have my containers always download the latest version of the required scripts at runtime (rather than baking the scripts in at container build time). However, from some quick Googling, it seems this approach isn’t encouraged. Another hacky approach would be to keep a “version file” in a shared location and have all scripts read the latest versions from that file when they start downstream jobs. However, this would still require a build and push to the registry after every bug fix (which something like Jenkins could perhaps automate), and then remembering to increment the corresponding number in the version file for the newly built container. It’s better than the current state, but it still seems like a lot of steps and is prone to human error.
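
To make the version-file idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the JSON layout of the file, the function names, and the registry and image names are made up, not something I have running. It only builds the job manifest as a plain dict; submitting it to the cluster is left out.

```python
import json
from pathlib import Path


def load_versions(version_file):
    """Read the shared version file (assumed: a JSON object of image name -> tag)."""
    return json.loads(Path(version_file).read_text())


def image_ref(registry, name, versions):
    """Build a full image reference, failing loudly if the image is not listed."""
    if name not in versions:
        raise KeyError(f"no version recorded for image {name!r}")
    return f"{registry}/{name}:{versions[name]}"


def job_manifest(job_name, image, namespace="default"):
    """Minimal batch/v1 Job manifest as a dict.

    The constant parts (tolerations, volume claims, labels) would be
    added here once; only the image changes between releases.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "namespace": namespace},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": job_name, "image": image}],
                    "restartPolicy": "Never",
                }
            }
        },
    }
```

The spawning code would then call `load_versions` each time it starts a job, so a new tag takes effect without rebuilding the spawner. The manual step of editing the version file after each push is still there, which is the part I’d like to get rid of.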

Any help / direction appreciated!