Rescheduling a pod after scale-up

The Kubernetes scheduler checks a node's load (among other things) only once, at scheduling time, when it picks the most suitable node for the pod.

Once a pod is scheduled on a node, it is bound to that node for the duration of its lifetime (see Pod Lifecycle):

Pods are only scheduled once in their lifetime. Once a Pod is scheduled (assigned) to a Node, the Pod runs on that Node until it stops or is terminated.

So the answer to your question is “no”, as others have already mentioned: the pod will not be rescheduled to any other node.

You might want to consider using a Horizontal Pod Autoscaler (see Horizontal Pod Autoscaling) to increase the number of pods in a deployment when CPU or memory usage goes over a threshold. That way, if your application is heavily used, new pods will be created to share the load (assuming you have available resources on the existing nodes, or you manually add new nodes to your cluster for the new pods to land on).
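For reference, here is a minimal sketch of such an HPA; the deployment name `my-app` and the 80% memory target are placeholder assumptions, not values from your setup. Note that utilization-based targets only work if your pods declare resource requests:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical deployment name
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # scale out when average memory usage exceeds 80% of the requests
```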

So, in your scenario, when your application is stressed, the HPA will try to create a new pod; the scheduler will then check whether any node has at least your pod’s requested memory (1024Mi) available. Assuming the current nodes are at 95% RAM utilization and therefore cannot fit another 1024Mi, the scheduler will place the new pod on a third (empty) node, if one is available. This increases the number of pods from 2 to 3, with one pod of your application on each node (it will not move a pod from, say, node 2 to node 3). Your application’s load is then distributed across 3 pods instead of 2, which decreases the load on each pod.
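You can watch this happening with standard kubectl commands (the second one requires the metrics-server add-on to be installed in the cluster):

```sh
# See which node each pod was scheduled on
kubectl get pods -o wide

# Check per-node CPU/memory usage (requires metrics-server)
kubectl top nodes
```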

If the load then goes down, the HPA can also reduce the number of pods in your deployment.

Using the cluster autoscaler (see Automatically scale a cluster to meet application demands on Azure Kubernetes Service (AKS)), you can also increase the number of nodes in your cluster automatically: it adds nodes when pods cannot be scheduled due to resource constraints, and removes them again when they are underutilized.
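On AKS, enabling it is a single CLI call on an existing cluster; the resource group and cluster names below are placeholders:

```sh
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 3
```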

So, using a combination of an HPA for your application and the cluster autoscaler, you can have a fully “elastic” solution for both your app and your cluster.
