Dynamic pod creation in Kubernetes

I am building an API application in Python that will be used by a large number of users to submit job requests. A user calls an API endpoint to submit a job request with input values and receives a JobID in response, and the application runs the job in the background. The user can then poll for the result and retrieve it once the job completes.

The initial version of this application runs a fixed number of replicas and can handle multiple concurrent requests. However, as I increased the load, either the requests started failing or the application (the gunicorn workers) inside the pod started failing.

I was thinking of creating pods dynamically for the background jobs, which I have never done before. I think it would help scale the application under load, since each pod should terminate once its job finishes. However, I am not sure how I would manage the pods (for example, if a pod or a container inside it fails). In addition, I will be limited by the number of pods I can create.

Can someone share their experience building a similar application that has to handle a large volume of requests? How should such an application be designed to be highly scalable and resilient?

Hi ali1:

If it is the API application itself that needs to scale under heavy load, take a look at the Horizontal Pod Autoscaler, which adjusts the number of replicas based on observed metrics such as CPU utilization.
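As a rough sketch of what that could look like, the Python snippet below builds a HorizontalPodAutoscaler manifest (autoscaling/v2) as a plain dict and prints it as JSON, which kubectl apply -f accepts. The function name build_hpa_manifest, the Deployment name "job-api", and the replica and CPU numbers are all placeholders I made up, not values from your setup. Note that CPU-based scaling only works if your containers declare CPU requests and a metrics source such as metrics-server is running in the cluster.

```python
import json


def build_hpa_manifest(deployment, min_replicas=2, max_replicas=10, cpu_percent=70):
    """Return an autoscaling/v2 HPA manifest (as a dict) targeting a Deployment.

    All argument values are illustrative assumptions; tune them for your workload.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            # The workload whose replica count the autoscaler manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Scale to keep average CPU utilization near cpu_percent
            # of the CPU requested by the pods.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # "job-api" is a hypothetical Deployment name.
    print(json.dumps(build_hpa_manifest("job-api"), indent=2))
```

You could save the output to a file and apply it with kubectl apply -f, or write the same structure directly as YAML.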

For your “background job” pods, consider using Jobs. A Job gives you a mechanism for handling failures and retries (see “Handling Pod and container failures” in the Jobs documentation).
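To make that a bit more concrete, here is a minimal sketch of how your API could build one Job manifest per request. It only constructs the batch/v1 manifest as a dict; submitting it to the cluster (for example with the official Kubernetes Python client or kubectl) is left out. The function name build_job_manifest, the "bg-job-" name prefix, the image, and the arguments are hypothetical, and I am assuming your JobID contains only lowercase letters, digits, and hyphens so that it is valid in a Kubernetes object name.

```python
import json


def build_job_manifest(job_id, image, args, retries=3, ttl_seconds=3600):
    """Return a batch/v1 Job manifest (as a dict) for one background job.

    job_id is assumed to be DNS-label safe (lowercase alphanumerics and '-').
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"bg-job-{job_id}",
            "labels": {"app": "bg-job", "job-id": str(job_id)},
        },
        "spec": {
            # Number of retries before the Job is marked as failed.
            "backoffLimit": retries,
            # Delete the finished Job and its pods after this many seconds,
            # so completed jobs do not pile up in the cluster.
            "ttlSecondsAfterFinished": ttl_seconds,
            "template": {
                "spec": {
                    # With "Never", a failed pod is replaced by a new pod
                    # instead of restarting the container in place.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "worker", "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # The image and arguments below are placeholders.
    manifest = build_job_manifest(
        "42", "registry.example.com/worker:latest", ["--input", "value"]
    )
    print(json.dumps(manifest, indent=2))
```

With this approach the Job controller takes care of recreating failed pods up to backoffLimit, and your polling endpoint can report progress by reading the Job's status. The limit on how many pods can run at once still applies, so under very high load you may want to cap the number of Jobs created concurrently, for instance with a namespace ResourceQuota.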

Best regards,