I have a consumer workload that reads messages from a Kafka topic. The objective is to autoscale the consumer workload so that it consumes the topic as efficiently as possible. That is, I want to minimize the Kafka consumer offset lag at any point in time (the goal is a lag of 0).
I’d like to autoscale my workload based on a value proportional to the offset lag and a value proportional to the rate of change of the offset lag. More generally, I’m interested in using a PID controller to drive the autoscaling.
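To make the idea concrete, here is a rough sketch of the kind of controller I mean. It is not tied to any Kubernetes API: the class name `LagPD`, the gains, and the replica bounds are all hypothetical. It implements only the proportional and derivative terms described above. I left the integral term out of this sketch because lag never goes below the setpoint of 0, so a naive integral would only ever grow.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class LagPD:
    """Hypothetical PD controller mapping consumer lag to a replica count."""
    kp: float               # replicas per unit of lag (illustrative gain)
    kd: float               # replicas per unit of lag change per second
    min_replicas: int = 1
    max_replicas: int = 20
    _prev_lag: Optional[float] = None

    def step(self, lag: float, dt: float) -> int:
        """Return a desired replica count for the latest lag sample.

        lag: total consumer offset lag observed now
        dt:  seconds since the previous sample
        """
        if dt <= 0:
            raise ValueError("dt must be positive")
        # The setpoint is a lag of 0, so the error is the lag itself.
        if self._prev_lag is None:
            derivative = 0.0  # no history on the first sample
        else:
            derivative = (lag - self._prev_lag) / dt
        self._prev_lag = lag
        output = self.kp * lag + self.kd * derivative
        # Round up and clamp to the allowed replica range.
        return max(self.min_replicas, min(self.max_replicas, math.ceil(output)))


controller = LagPD(kp=0.25, kd=2.0)
for lag_sample in (10.0, 14.0, 0.0):
    print(controller.step(lag_sample, dt=1.0))
```

The derivative term lets the controller scale up while lag is still small but growing quickly, and back off once lag is falling, which the purely proportional rule cannot do.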
I know that Kubernetes does not support this today and can only autoscale based on a value proportional to an observed metric. Specifically:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
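For reference, the rule above can be written out as a small sketch. The function name and the sample numbers are mine and purely illustrative; this is not the HPA controller's actual code.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         desired_metric: float) -> int:
    """Proportional scaling rule quoted above (illustrative only)."""
    return math.ceil(current_replicas * (current_metric / desired_metric))


# 4 consumers, observed lag per pod of 1500 against a target of 1000.
print(hpa_desired_replicas(4, 1500.0, 1000.0))  # 6
```

Note that, as written, the formula has no term for how fast the metric is changing, and a target of 0 would make the ratio undefined, so my goal of a lag of 0 cannot be expressed directly as `desiredMetricValue`.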
What would be the community's recommended way today to implement a custom autoscaling control algorithm like this within Kubernetes?
Is there any appetite within the community for a more configurable HPA spec that allows custom autoscaling algorithms? Or could the HPA spec simply use the value of a metric directly as the target replica count? Is there any work being done in this regard? The standard algorithm today is very opinionated. I’d like to be able to feed the Kubernetes HPA controller a replica count that is computed by a custom algorithm.