Custom Deployment/StatefulSet Behavior Advice Request

Kubernetes version: 1.16.8
Cloud being used: AWS
Installation method: Kops
Host OS: Debian
CNI and version: Calico 3.9.3
CRI and version: Docker 18.09.9

All of our applications follow a similar lifecycle: containers are built via a CI/CD process and installed/updated via Helm charts.

We have long-running applications that stream data and can act either as a server (direct interaction with an end user) or as a client (sending data to a relay, a repeater, or another stream). These apps have a primary instance and backup instances. This is done for several reasons: redundancy and reliability, plus mixed configuration, where the primary can stream high-quality data while the backups stream lower-quality data or different formats depending on the use case.

Deployments and Helm charts have gotten us 90% of the way to our goal, with one exception: when updating the applications to the latest version, or after a configuration change, our requirements dictate that we update the backup instances first, wait for them to become ready, and only then roll the primary instance.

Since each instance of the application has its own configuration, we can't run them as a single ReplicaSet. We looked at implementing a StatefulSet, but in some cases application instances have sidecar containers for additional data processing and others do not, so a single set definition doesn't work. There is also the added complexity that the instances can be deployed to multiple clusters in different AWS regions, which further complicates a StatefulSet implementation.
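
To illustrate why a single pod template can't cover every instance, here is a rough sketch of how the per-instance Helm values differ (the names and fields below are hypothetical, just to show the shape of the problem):

```yaml
# values-primary.yaml (hypothetical)
role: primary
streamQuality: high
sidecar:
  enabled: false

---
# values-backup-1.yaml (hypothetical)
role: backup
streamQuality: low
sidecar:
  enabled: true                      # this instance does extra data processing
  image: example/data-processor:1.0  # placeholder image
```

Because every instance carries its own configuration like this, a shared pod template (ReplicaSet or StatefulSet) is a poor fit.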

We’re investigating the Operator Framework to bridge the gap, but we’re not sure that’s the right path to go down. There are also several implementations (KUDO, Metacontroller, the Operator SDK, etc.), and it’s hard to tell which one would be the right fit.

Does anyone have advice on how we should go about meeting this requirement?

Much appreciated.

In case this helps anyone else facing this type of issue, we went with a custom init container that performs health checks against the backup pods. Combined with the RollingUpdate deployment strategy, this effectively allows the primary pod to continue running while its replacement waits until all the backups are ready. Once the backups are ready, the rolling update of the primary completes. If there is an issue, the replacement primary pod fails at initialization, allowing an operator to remedy it.
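
For anyone wanting a starting point, here is a minimal sketch of the approach. Everything in it (names, labels, Service hostnames, the /healthz endpoint, ports, and images) is a placeholder for our real setup, so adapt it to your own:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stream-primary                 # hypothetical name
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                # old primary keeps running while the new pod initializes
      maxSurge: 1
  selector:
    matchLabels:
      app: stream
      role: primary
  template:
    metadata:
      labels:
        app: stream
        role: primary
    spec:
      initContainers:
        - name: wait-for-backups
          image: curlimages/curl:7.68.0   # any small image with curl will do
          command:
            - sh
            - -c
            - |
              # Poll each backup's health endpoint until it responds,
              # failing initialization after ~5 minutes per backup.
              for backup in stream-backup-1 stream-backup-2; do   # hypothetical Service names
                tries=0
                until curl -sf "http://${backup}:8080/healthz"; do
                  tries=$((tries + 1))
                  if [ "$tries" -ge 60 ]; then
                    echo "backup ${backup} never became ready" >&2
                    exit 1
                  fi
                  sleep 5
                done
              done
      containers:
        - name: stream
          image: example/stream:1.2.3    # placeholder application image
          ports:
            - containerPort: 8080
```

With maxUnavailable: 0 and maxSurge: 1, the rolling update brings up the replacement primary pod alongside the old one, and the old primary keeps serving until the init container sees every backup healthy. If a backup never comes up, the init container exits non-zero, the rollout stalls, and the old primary stays untouched until someone intervenes.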