How to use initContainers with Jobs alongside two different strategies

piotrmasior · December 8, 2020, 11:00am

Hello,

I am having trouble understanding how this can be achieved, so a little background first.

My application has the following requirements:

The microservice runs in two different modes (achieved by injecting different ConfigMaps into the same build image). Let's call them API (many instances) and WORKER (only one instance can be active at runtime).
Because of problems with the in-application locking mechanism, the worker cannot guarantee that only one instance is actively processing, and ordering is critical, so we cannot use the rolling update strategy for this mode.
Additionally, we need to spawn a Job before each deployment that updates some persistent storage.
If that Job fails, the old worker instance should stay active. If the Job succeeds, the recreate should happen.
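
To make the required ordering concrete, here is a minimal sketch of the decision logic, not a working deployment tool. The names `deploy_worker`, `run_job`, and `recreate_worker` are hypothetical; in a real pipeline the two callables would wrap whatever actually launches the Job and applies the WORKER Deployment.

```python
from typing import Callable


def deploy_worker(
    run_job: Callable[[], bool],
    recreate_worker: Callable[[], None],
) -> str:
    """Run the pre-deployment Job first; touch the worker only on success.

    run_job is a hypothetical callable that blocks until the Job has
    finished and returns True on success, False on failure.
    recreate_worker is a hypothetical callable that replaces the old
    WORKER instance with the new one.
    """
    if not run_job():
        # Job failed: change nothing, so the old worker keeps running.
        return "kept-old-worker"
    # Job succeeded: only now is the old worker stopped and replaced.
    recreate_worker()
    return "recreated-worker"
```

The point of the sketch is that the Job result is known before anything happens to the running worker.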

What we came up with is:

For the WORKER role we use the Recreate strategy (a small downtime of 10s is acceptable for this scenario).
For the API role we stay with RollingUpdate.
We added an initContainers definition that waits for the Job to finish, so we can be sure that new code will not be rolled out until the Job is done.
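
The wait condition in the init container amounts to a polling loop like the sketch below. This is an illustration under assumptions, not our actual init container: `wait_for_job` and `get_condition` are hypothetical names, and `get_condition` stands in for whatever query returns the Job's current condition type (`"Complete"`, `"Failed"`, or `None` while it is still running).

```python
import time
from typing import Callable, Optional


def wait_for_job(
    get_condition: Callable[[], Optional[str]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the Job is Complete (True) or Failed / timed out (False)."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        condition = get_condition()
        if condition == "Complete":
            return True
        if condition == "Failed":
            return False
        # Job still running: wait before asking again.
        sleep(interval_s)
    # Timed out without a terminal condition; treat as not ready.
    return False
```

Because this loop runs inside the new pod, it only starts once that pod has been scheduled, so it delays the new code but cannot protect the old instance.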

On the happy path this works well. When the Job takes longer, a problem appears: the downtime is extended significantly.
Another, more critical problem is that, because of the Recreate strategy, the worker goes down first and does not wait for the Job result. If the Job fails, we end up with the WORKER down and someone has to restore the appropriate state manually, which is unacceptable.

We think a solution could be to use something like https://github.com/groundnuty/k8s-wait-for/, but we are not sure whether that is the right path in this case or whether we are missing other options.

Does anyone have suggestions for us?

Cluster information:

Kubernetes version:
Cloud being used: (aws)
Installation method: custom
