An "update on restart" behavior


#1

I’d really like to be able to kubectl apply changes to my configuration, but have pods continue to run until they restart. My application has direct, stateful interaction with users, and so I want to be able to wait for explicit user action for rolling out an update.

My question is: Is there a simple way to implement this? Below I detail some of my tribulations while trying to do it myself, but I’m happy to take a completely unrelated solution.


StatefulSet is a good fit here – I want one pod per user/application instance. If I set the StatefulSet updateStrategy to OnDelete the behavior is closer – it waits for the Pod to be deleted to perform an update. If a user stops their application, however, the Pod restarts the container without receiving an update. Presumably this is because the Pod was never deleted. I then set the template.spec.restartPolicy to Never, in order to have the Pod stop and trigger the OnDelete behavior of the StatefulSet. Unfortunately there’s an undocumented special case that causes an error:

The StatefulSet "bukzor-orange-us" is invalid: spec.template.spec.restartPolicy: Unsupported value: "Never": supported values: "Always"
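
For reference, the OnDelete setup described above can be sketched as a manifest built in Python and dumped as JSON, which kubectl apply -f accepts. This is only an illustration: the function name, the labels, and the image are placeholders I made up, and only the StatefulSet name comes from the error message.

```python
import json


def statefulset_manifest(name: str, image: str) -> dict:
    """Build a StatefulSet whose pods are only updated after they are deleted."""
    labels = {"app": name}  # placeholder label, an assumption
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            # The update strategy sits directly under spec.
            "updateStrategy": {"type": "OnDelete"},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # "Always" is the only value StatefulSet validation accepts.
                    "restartPolicy": "Always",
                    "containers": [{"name": "app", "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # The image reference is hypothetical.
    manifest = statefulset_manifest("bukzor-orange-us", "example.invalid/app:v2")
    print(json.dumps(manifest, indent=2))
```

With this in place, applying a new image changes the StatefulSet's target revision, but a running pod keeps its old spec until something deletes it.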

The devs discussed allowing restartPolicy:Never but decided against it, for strictly aesthetic and non-technical reasons, from my perspective.

The docs do mention .spec.template.updateStrategy.type:OnDelete, which sounds like it might be what I want, but that seems to be a typo for .spec.updateStrategy.type, the setting I already tried.

I don’t make use of the StatefulSet volumeClaimTemplates (all of my volumes have a larger scope than the StatefulSet), so switching to Deployment is an option, but its update strategies (RollingUpdate and Recreate) are just two ways to restart immediately, as far as I can see.

It seems like I should be writing my own custom pod controller at this point, but I haven’t gotten to that level of Kubernetes yet, and I’m afraid it will be a large time investment. Is there a controller framework that makes custom controllers easy-ish and is production ready? Honestly, my custom controller might be exactly StatefulSet with an allowance for restartPolicy:Never.


#2

Your own controller would likely be the best path, but a Job plus some logic might fit the bill, at least in the interim. You could spin up a Job per user, and the Job ‘completes’ when the pod has terminated gracefully.

That would still allow it to recover in the event of a failure, and updating would essentially mean firing off a new Job with the updated pod spec.
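
A rough sketch of what a per-user Job could look like, again as a Python dict that can be dumped to JSON for kubectl apply -f. The naming scheme, the labels, and the image are assumptions for illustration. restartPolicy OnFailure restarts the container after a crash, while a clean exit (status 0) completes the Job.

```python
import json


def job_manifest(user: str, version: str, image: str) -> dict:
    """Build one Job per user and version (hypothetical naming scheme)."""
    # A Job's pod template cannot be changed after creation, so the version
    # goes into the name and an update becomes a brand new Job.
    name = f"app-{user}-{version}"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "labels": {"user": user, "version": version}},
        "spec": {
            # Give up after this many failed retries (6 is also the default).
            "backoffLimit": 6,
            "template": {
                "spec": {
                    # Restart on a crash; a clean exit marks the Job complete.
                    "restartPolicy": "OnFailure",
                    "containers": [{"name": "app", "image": image}],
                }
            },
        },
    }


if __name__ == "__main__":
    # User, version, and image are made-up example values.
    print(json.dumps(job_manifest("orange", "v2", "example.invalid/app:v2"), indent=2))
```

Something still has to create the next Job when the user asks for their instance again, so this moves the logic rather than removing it.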


#3

Thanks! It seems like I haven’t explained my purpose well enough. I’ll try again :)

My goal is to decouple my updates as developer/maintainer from any changes in behavior for the user. The intent is that they hit “restart” and in a few seconds get a nearly identical application, but with bugs fixed. Many of my users are in India, so I would be asleep when they do this. This is why I’d like to have the “next version” already applied to Kubernetes, then let the actual rollout be driven by the users’ actions.

I don’t see how to do that with a Job, but I’m quite interested.


#4

Sorry, I should have expanded more. I don’t think you’re going to be able to do what you want natively without some level of communication between your application and Kubernetes itself. I don’t think you’ll need to go the full controller/CRD route, but your application will need to talk to the Kubernetes API.
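
As one possible shape for that communication, here is a minimal sketch in which the application deletes its own pod when the user hits “restart”, using only the Python standard library. It assumes the pod runs with a service account that is allowed to delete pods in its namespace (RBAC is not shown), that the default service account files are mounted, and that HOSTNAME is the pod name (the default unless spec.hostname is set). The function names are made up for this example.

```python
import os
import ssl
import urllib.request

# Standard in-cluster mount point for service account credentials.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def build_delete_request(namespace: str, pod: str, token: str,
                         api: str = "https://kubernetes.default.svc") -> urllib.request.Request:
    """Build (but do not send) the DELETE request for a single pod."""
    url = f"{api}/api/v1/namespaces/{namespace}/pods/{pod}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def delete_own_pod() -> None:
    """Ask the API server to delete the pod this process is running in."""
    with open(f"{SA_DIR}/namespace") as f:
        namespace = f.read().strip()
    with open(f"{SA_DIR}/token") as f:
        token = f.read().strip()
    pod = os.environ["HOSTNAME"]  # assumed to equal the pod name
    ctx = ssl.create_default_context(cafile=f"{SA_DIR}/ca.crt")
    req = build_delete_request(namespace, pod, token)
    with urllib.request.urlopen(req, context=ctx) as resp:
        resp.read()
```

With a StatefulSet set to OnDelete, the controller would then recreate the pod from the latest applied template.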

TBH, thinking about it, the Job idea would not net you much. Either way you need something to take some action:

  • StatefulSet - Delete the pod to trigger OnDelete when the user is done.
  • Job - Spin up a Job based on the new version when the user requests their instance spun up.
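
For the StatefulSet option, the “when the user is done” decision could be sketched as a pure function over the JSON that kubectl get -o json returns. This assumes that a container restart is the signal that the user stopped the application, and that the caller remembers the restart count it saw when the new revision was applied. The function names and the restarts_seen bookkeeping are hypothetical. The controller-revision-hash label and status.updateRevision are what the StatefulSet controller sets itself.

```python
from typing import Optional


def pod_revision(pod: dict) -> Optional[str]:
    """Revision the pod was created from, as labelled by the StatefulSet controller."""
    return pod.get("metadata", {}).get("labels", {}).get("controller-revision-hash")


def total_restarts(pod: dict) -> int:
    """Sum of container restart counts reported in the pod status."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return sum(cs.get("restartCount", 0) for cs in statuses)


def should_delete(pod: dict, statefulset: dict, restarts_seen: int) -> bool:
    """True when the pod runs an old revision AND has restarted since we last looked."""
    target = statefulset.get("status", {}).get("updateRevision")
    stale = target is not None and pod_revision(pod) != target
    restarted = total_restarts(pod) > restarts_seen
    return stale and restarted
```

A small watcher loop could call should_delete on each pod event and issue the delete, at which point OnDelete takes care of bringing the pod back on the new revision.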

#5

I agree, I’ll need to write a bit of logic myself. I still feel a bit salty about the StatefulSet spec.template.spec.restartPolicy:Never limitation though; it seems like they’d have to add code to make that an error.

Perhaps my “controller” can be as simple as a Finalizer though? When finalizing pods of my application, create a replacement with the new, improved configuration. Thinking about all the edge cases though, I probably need to manage the entire lifecycle.

Do you know of any helpful code that I can leverage for this? Or should I just directly use the golang k8s client? I know Python really well, Go not at all, but I’ve been meaning to learn.