EKS Scaling advice needed

I am very new to k8s so bear with me…

I am using AWS EKS with RDS and Redis, and I’m having a bit of an issue figuring out how to scale properly.

Currently I am running an app that basically consists of 3 deployments.

Deployment 1 is my front-end deployment, which has an nginx/laravel pod
Deployment 2 is my scheduler deployment, running artisan schedule on a single pod
Deployment 3 is my worker deployment, running artisan queue, which picks up Redis jobs to process

Everything works great: I can run jobs and the scheduler deployment will pick them up. I can manually specify replicas on my worker deployment and each replica will pick up a job. Great.

What I am trying to accomplish though is the following:

I want to run all of the front end (deployment 1 and deployment 2) on a small EC2 instance (node?) and run my workers on much larger EC2 instances when Redis jobs are in the queue.

Ideally, the “worker” EC2 instance will only spin up when Redis has jobs that need to be done. This is to save some bucks, since the worker instances need to be much larger to handle the amount of data we process.

Once the worker EC2 instance spins up, it will start processing jobs and then automatically scale the worker pods to handle more jobs. If the initial worker EC2 instance becomes maxed out with work (pods), it should spin up another EC2 instance.

It should keep scaling this way until all the jobs are done.

After that, we scale down the worker pods, then scale down the EC2 instances, until we are left with just the front end on the much smaller EC2 instance.

Sorry if some of this is wrong/doesn’t make sense, I’m really fresh with all of this and I’m trying my hardest to wrap my brain around it all.

Can this be done out of the box? If not, what should I be reading about? Am I close? Waaaay off? Any help would be greatly appreciated! Thanks!

Ideally, you don’t pin your deployments to specific instances.

Kubernetes abstracts that away: you just use CPU/memory requests to specify what a deployment needs, and Kubernetes makes sure it gets it (it doesn’t matter which node it runs on).

This is the basic idea, and if you can work this way, you will see tons of benefits. And it is way simpler, of course :-).
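For example, a minimal sketch of what the worker deployment could look like with requests set (the name, image, and numbers here are made up, tune them to your app):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
      - name: worker
        image: my-registry/laravel-app:latest   # hypothetical image
        command: ["php", "artisan", "queue:work"]
        resources:
          requests:             # what the scheduler reserves for this pod
            cpu: 500m
            memory: 512Mi
          limits:               # hard cap for the container
            cpu: "2"
            memory: 1Gi
```

With requests in place, the scheduler just finds a node with room, and the autoscalers have numbers to work with.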

You can use the HPA to scale pods, and the cluster autoscaler to scale nodes. If you need to scale the pods based on something other than CPU/memory, you should also look into custom metrics.
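For the CPU/memory case, a minimal HPA sketch targeting the worker deployment above (autoscaling/v2 syntax; names and thresholds are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker              # hypothetical deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # add pods when average CPU goes above 80%
```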

And if only your Redis workers need to scale, you can tell the cluster autoscaler to scale only that ASG (assuming your cloud provider is currently supported by the cluster autoscaler).
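On AWS the cluster autoscaler takes one --nodes flag per ASG it is allowed to resize, so you can hand it only the worker ASG. A sketch of the relevant container args (the ASG name and bounds are made up):

```yaml
# excerpt from the cluster-autoscaler Deployment's pod spec
containers:
- name: cluster-autoscaler
  image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.2  # match your cluster version
  command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --nodes=0:5:my-worker-asg   # min:max:ASG-name; only this group gets scaled
```

With a minimum of 0 it can even drop the worker group to nothing when there is no work, though scaling up from zero needs some extra ASG tagging.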

Does this make sense? Or am I missing something?

Oh, I forgot to mention: if you do need to pin deployments to nodes, you can do it using labels and node selectors (there are docs on the Kubernetes side).
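Roughly: label the big node(s), then add a nodeSelector to the worker pod template. The label key/value here are hypothetical:

```yaml
# First label the node: kubectl label nodes <big-node-name> workload=worker
# Then, inside the worker Deployment:
spec:
  template:
    spec:
      nodeSelector:
        workload: worker    # pods only schedule onto nodes with this label
```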

Don’t hesitate to ask if you find any problems or if anything is not clear 🙂

This makes total sense. Thanks!

Right now I’m trying to figure out the best way to set up custom metrics for Redis jobs. I keep coming across a lot of articles about Prometheus. Would you say that’s the common approach, or is there something simpler, like querying the Redis server directly for the number of jobs?

Thanks for the help!

I’ve never used custom metrics, so I don’t really know :-/

But can’t you use CPU anyway (if the workers are CPU bound)? For example, if CPU is above 80% for some time, doesn’t that mean you have tons of jobs and your workers are very busy? And if CPU goes down, you can scale down?

So you can scale on CPU and just use that metric. It might be better than queue size, as each job may take quite a different amount of time to run, so you can’t really know how many workers you need from the number of jobs alone.

If you just observe that the workers are under heavy load and add/remove pods accordingly, it may even work better.
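If you go the CPU route, the “above 80% for some time” part maps onto the HPA’s behavior settings (autoscaling/v2). A sketch of what you might add under the worker HPA’s spec from earlier so brief dips don’t cause flapping:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # wait 5 minutes of low CPU before removing pods
    policies:
    - type: Pods
      value: 1                        # then remove at most one pod per minute
      periodSeconds: 60
```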

What do you think? Or are your workers not CPU bound?