We have designed a microservice-based system in which some of the services run computational code that takes roughly 0.5 to 5 seconds per computation.
In front of each service there is a work queue, so each service instance pulls the next message when it is free.
We deploy these microservices as Pods in a K8s cluster, and we wonder which design will perform better on K8s:
1. Designing each microservice as a single-threaded application, so each instance handles exactly one computation at a time and K8s handles concurrency and/or parallelism across the cluster (horizontal scaling).
2. Designing each microservice as a multi-threaded application, so each instance handles multiple computations at once: the service handles concurrency and/or parallelism inside its own application/Pod, and K8s additionally handles concurrency and/or parallelism across the cluster (horizontal scaling).
I know it’s a tricky question, because K8s abstracts away cluster resources. What I do know for sure is that if we had a single machine with 24 cores and 2 hyper-threads per core, the ideal setup would be a single-process service running 48 threads in parallel, one per computation.
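
To make the two options concrete, here is a minimal Python sketch of the worker loop I have in mind. It is only an illustration under assumptions: `compute`, `run_service` and `process_all` are hypothetical names, the in-memory `queue.Queue` stands in for the real work queue, and the short sleep stands in for the real 0.5 to 5 second computation. With `worker_count=1` it behaves like option 1; with `worker_count=N` it behaves like option 2.

```python
import queue
import threading
import time


def compute(job):
    # Hypothetical stand-in for the real computation (~0.5-5 s in our system).
    time.sleep(0.01)
    return job * job


def run_service(work_queue, results, worker_count):
    """Drain work_queue using worker_count threads inside one process (one Pod)."""

    def worker():
        while True:
            try:
                # Pull the next message only when this worker is free.
                job = work_queue.get_nowait()
            except queue.Empty:
                return  # Nothing left to do; a real service would block and wait.
            results.put((job, compute(job)))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def process_all(jobs, worker_count):
    """Helper for the example: enqueue jobs, run the service, collect results."""
    work_queue = queue.Queue()
    results = queue.Queue()
    for job in jobs:
        work_queue.put(job)
    run_service(work_queue, results, worker_count)
    collected = {}
    while not results.empty():
        job, value = results.get()
        collected[job] = value
    return collected


if __name__ == "__main__":
    print(process_all(range(8), worker_count=1))  # option 1: one computation at a time
    print(process_all(range(8), worker_count=4))  # option 2: several computations per Pod
```

Both calls produce the same results; only the amount of in-process concurrency differs. Note that this shows the structure only: in CPython, threads running pure-Python CPU-bound code do not execute in parallel because of the global interpreter lock, so a real implementation of option 2 would need native code that releases the lock, multiple processes, or a different language.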