HPA based on readiness probe?


I have been playing around with the Kubernetes HPA lately and have a use case I could not find a solution for.
One of my services can process only one task at a time: once it starts processing some data, it will not accept any other requests. I can start as many instances of this service as needed. Currently we run a fixed number of replicas, but I am looking for a way to have Kubernetes scale it automatically when it runs out of available instances.
The problem is that I cannot use the default CPU metric to determine whether a pod is busy, because depending on the data it is processing, a busy pod might not generate enough load to trigger the autoscaler.

One idea I had was to use the readiness probe. I have a command I can run in each pod to check whether it is busy. However, I could not find a way to tell the HPA to use only this information to start additional replicas. For example: run between 40 and 120 replicas, and start scaling up when the share of available pods (readiness probe returning true) drops below 10%.
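
To make the rule above concrete, here is a rough sketch in Python of the decision I have in mind. It is not how the HPA computes replicas, and the function name and parameters are made up for illustration. It assumes something outside this code supplies the current replica count and the number of pods whose readiness probe passes. Scale-down is left out on purpose, since removing replicas could kill a pod that is in the middle of a task.

```python
def desired_replicas(total, available, min_replicas=40, max_replicas=120,
                     min_available_pct=10):
    """Hypothetical scale-up rule: keep at least min_available_pct percent of pods idle.

    total      current number of replicas
    available  replicas whose readiness probe passes (not busy)
    """
    def clamp(n):
        # Stay within the configured replica bounds.
        return max(min_replicas, min(max_replicas, n))

    # Enough idle pods: keep the current count (only bounded by min/max).
    if available * 100 >= min_available_pct * total:
        return clamp(total)

    busy = total - available
    # Smallest n with (n - busy) / n >= pct / 100, i.e. n >= busy * 100 / (100 - pct).
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = -(-busy * 100 // (100 - min_available_pct))
    return clamp(needed)


if __name__ == "__main__":
    print(desired_replicas(total=40, available=10))   # 25% idle, no change
    print(desired_replicas(total=40, available=2))    # 5% idle, scale up
    print(desired_replicas(total=120, available=0))   # already at the maximum
```

With 40 replicas and only 2 available, this asks for 43 replicas, which is the smallest count that leaves at least 10% idle if the 38 busy pods stay busy.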

Any ideas?


Hey, I’ve been working on a way to write your own autoscalers in Kubernetes easily. It’s not production-ready yet, but it might help with your issue. I’m not sure if this is what you’re looking for; if not, sorry for wasting your time! The repo is https://github.com/jthomperoo/custom-pod-autoscaler

Hi, sorry for the late reply. I have been focusing on other tasks in the meantime, but I will definitely have a look as soon as I am back on this topic. Thanks!