Single-threaded service load balancing/scaling

Hi,

This is a general question relating to a particular service we need to scale and access in a specific way.
The service itself is single-threaded in the sense that only one connection can be established to it at a time. Specifically, it is a document generation service based around OpenOffice, which can only process a single request at a time.
We need to manage a “pool” of these, hand out connections to other applications as required, and hopefully scale it appropriately to cater for the number of concurrent connections required.
I am new to Kubernetes and have done some reading on the available load balancing and scaling options, but I cannot seem to find any reference to load balancing/scaling single-threaded applications like these.
Do we have to build our own front-end load balancer app to support this, or is there a way Kubernetes can do this for us?
Hopefully this makes sense and someone can assist. 🙂

Cheers
Tony


What is your expected behaviour in case you do not have any pods available to serve the request?
Example: 5 single-threaded pods are serving 1 request each and a 6th request comes in. What should happen? Should the request be queued? How long should it be queued for? Should it fail instead? Will the consumer retry the request in case of failure?

You almost certainly want your own front-end “proxy” that receives N incoming requests and allocates them to backends. You’ll have to think about connection-lifecycle management (how do you know a backend is “available”?), how you handle overload (more incoming requests than backends), resilience of the proxy itself (what if your front-end crashes or needs an update?), and things like that.

There is no built-in support for this - it’s too domain-specific 🙂
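As a starting point, here is a minimal sketch of such a proxy in Go. It treats a buffered channel as the pool of free backends, so each request checks out exactly one backend for its lifetime and queues until one is free. The backend addresses, ports, and the 30-second queue timeout are hypothetical placeholders; a real version would also need health checking and restart handling.

```go
// Minimal sketch of a front-end proxy for single-threaded backends.
// Each incoming request exclusively checks out one backend from the
// pool; further requests block (queue) until a backend is returned.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

func main() {
	// Hypothetical backend addresses - one per single-threaded pod.
	backends := []string{
		"http://oo-worker-0:8100",
		"http://oo-worker-1:8100",
	}

	// A buffered channel acts as the set of available backends:
	// receiving checks one out, sending returns it to the pool.
	pool := make(chan *url.URL, len(backends))
	for _, b := range backends {
		u, err := url.Parse(b)
		if err != nil {
			log.Fatal(err)
		}
		pool <- u
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		select {
		case target := <-pool:
			// Return the backend once this request finishes, so the
			// next queued request can use it.
			defer func() { pool <- target }()
			httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
		case <-time.After(30 * time.Second):
			// Overload policy: fail fast after queueing for 30s.
			http.Error(w, "no backend available", http.StatusServiceUnavailable)
		}
	})

	log.Println("proxy listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```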

Thanks for the responses. Very helpful.
Looks like we will be building a front-end proxy to deliver our solution. 🙂

You can try using Knative Serving with a concurrency of 1. You can configure it to handle only one concurrent request at a time; it comes with a queue-proxy sidecar container that makes sure a request is done before sending the next request to the user container, and it scales out more pods as needed based on overall traffic to the service.
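For reference, a Knative Service manifest along those lines might look like the sketch below. The service name, image, and port are placeholders, but `containerConcurrency` is the Knative Serving field that enforces one in-flight request per pod:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: docgen                    # hypothetical service name
spec:
  template:
    spec:
      # The queue-proxy sidecar enforces this limit: each pod receives
      # at most one request at a time; excess requests are queued and
      # drive the autoscaler to add more pods.
      containerConcurrency: 1
      containers:
        - image: example.com/openoffice-docgen:latest  # placeholder image
          ports:
            - containerPort: 8080
```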