I have a use case in which a front-end application sends a file to a back-end service for processing. A back-end service pod can process only one request at a time. If multiple requests arrive, the service should autoscale and route each additional request to a new pod. So I am looking for a way to spawn a new pod for each request; once the back-end pod finishes processing, it should return the result to the front-end service and destroy itself, so that each pod only ever processes a single request.
I explored HPA autoscaling but did not find a suitable way to do this. I am open to using a custom metrics server, or even Jobs, if they can fulfill the above scenario.
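To make the Job-based idea concrete, here is a rough sketch of what I have in mind: the front end (or a small dispatcher) would create one Job per incoming request, so each pod handles exactly one file and then exits. This is only an illustration under assumptions; the function name `build_job_manifest`, the image `example.com/file-processor:latest`, the `INPUT_URI` environment variable, and the `process-<id>` naming are all hypothetical placeholders, not something I have working.

```python
import json
import uuid


def build_job_manifest(request_id: str, image: str, input_uri: str) -> dict:
    """Return a batch/v1 Job manifest that runs one pod for one request."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            # Hypothetical naming scheme: one Job per request ID.
            "name": f"process-{request_id}",
            "labels": {"request-id": request_id},
        },
        "spec": {
            "completions": 1,   # the Job is done after one successful pod
            "parallelism": 1,   # never more than one pod for this request
            "backoffLimit": 0,  # do not retry a failed request automatically
            # Let Kubernetes delete the finished Job and its pod after 60 s.
            "ttlSecondsAfterFinished": 60,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "processor",
                            "image": image,
                            # Assumed convention: the container reads the
                            # file location from this environment variable.
                            "env": [{"name": "INPUT_URI", "value": input_uri}],
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    request_id = uuid.uuid4().hex[:8]
    manifest = build_job_manifest(
        request_id,
        "example.com/file-processor:latest",  # placeholder image
        "s3://example-bucket/upload.bin",      # placeholder file location
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`, or the same dict could be submitted through a Kubernetes client library. What this sketch does not cover is how the pod returns its result to the front end, which is the part I am least sure about.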
If anyone has knowledge of this or has tackled the same use case, please help so that I can try that solution too. Thanks in advance.