GKE - More than enough host resource capacity available and pods becoming unschedulable

#1

Hi,

I’m in the process of testing and capacity planning a Kubernetes cluster on GKE. Currently, I have a single cluster with a 4-node configuration (32 vCPUs and 200 GB of memory in total) and autoscaling turned on.

What I’ve found is that pods become unschedulable even when there’s more than enough capacity available: only one host in the cluster shows high CPU usage, at 68%. I’ve also limited the cluster so it cannot expand beyond 4 nodes, and it seems the scheduler thinks it needs more node capacity and refuses to schedule the newly requested containers.

I think I’m missing something important, such as the number of pods suited to a specific host configuration (e.g. n1-standard-8), or the resource monitor may be reporting an incorrect resource status to the scheduler.

Can anyone share best practices with regard to pod capacity and host configuration (is it better to go with high memory, high CPU, or a combination of both?) for autoscaling clusters? I would like to maximize the number of allowed pods in the private network and the allowable host resources, given the GKE quota limits.

A screenshot can be found in this post - https://groups.google.com/forum/#!topic/gce-discussion/UC6__8AxPXE

I was running 153 pods, 76 disks, 90 services (combination of internal & public)

Regards,
Rustin

#2

Adding resource requests and limits to the pods really helps the system understand how many resources they need and where to put them. See "Managing Compute Resources for Containers" in the Kubernetes docs.

If you leave those options empty, Kubernetes has to schedule the pods without knowing how much space they actually need.
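To make that concrete, here is a minimal sketch of a pod manifest with requests and limits filled in, built in Python and printed as JSON (kubectl accepts JSON manifests as well as YAML). The pod name, image and all the CPU/memory values are placeholders I made up for illustration, so size them for your own workload.

```python
import json


def pod_manifest(name, image, cpu_request, mem_request, cpu_limit, mem_limit):
    """Build a minimal Pod manifest with explicit resource requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # requests: what the scheduler reserves on a node
                        "requests": {"cpu": cpu_request, "memory": mem_request},
                        # limits: the ceiling enforced at runtime
                        "limits": {"cpu": cpu_limit, "memory": mem_limit},
                    },
                }
            ]
        },
    }


# Hypothetical values: a quarter of a core and 512Mi requested, double that as the limit.
manifest = pod_manifest("example-app", "nginx", "250m", "512Mi", "500m", "1Gi")
print(json.dumps(manifest, indent=2))
```

If you save the output to a file, something like `kubectl apply -f pod.json` should work, but treat this as a starting point rather than a recommended sizing.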

As for hard limits, there is a default of 110 pods allowed per node, which can be changed using the --max-pods flag on the kubelet (not sure where in GKE you’d change that though).
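As a quick sanity check on the pod count, here is a rough sketch using the numbers from the first post (153 pods on 4 nodes). It assumes the 110-per-node figure applies to your cluster unchanged, which I haven't verified for GKE.

```python
import math

MAX_PODS_PER_NODE = 110  # default kubelet limit; assumed to apply as-is


def min_nodes_for_pods(pod_count, max_pods_per_node=MAX_PODS_PER_NODE):
    """Smallest node count that satisfies the per-node pod limit alone."""
    return math.ceil(pod_count / max_pods_per_node)


pods, nodes = 153, 4  # figures from the original post
cluster_pod_capacity = nodes * MAX_PODS_PER_NODE
avg_pods_per_node = pods / nodes

print(f"pod slots available: {cluster_pod_capacity}")
print(f"average pods per node: {avg_pods_per_node:.2f}")
print(f"nodes needed for the pod limit alone: {min_nodes_for_pods(pods)}")
```

So on those numbers the pod count limit by itself shouldn't be what is blocking scheduling, which points back at requests and limits.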

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
https://prefetch.net/blog/2018/02/10/the-kubernetes-110-pod-limit-per-node/

#3

Thanks for the tips, much appreciated :raised_hands: I didn’t add any resource limits to the container specs I provisioned. It’s definitely a very important element of Kubernetes which I’ve overlooked. I’ll do the reading, apply the updated configuration and get back with results.

Cheers.

#4

Hope it helps, let me know how it works out :slight_smile:

#5

Hi macintoshprime,

Following up after my last post.

I’m able to report that the scheduler and autoscaler on GKE are working perfectly.

After adding resource requests and limits to the pods I provision, and selecting a machine type (n1-standard-8) that matches the CPU-to-memory ratio of my workload for the autoscaling cluster, I get near-perfect host resource allocation.

Scaling happens seamlessly and only when it needs to - resulting in a really well-compacted cluster.
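For anyone finding this later, my understanding is that the scheduler decides whether a pod fits by comparing its requests with what is still unreserved on each node, not by looking at live usage. Here is a simplified sketch of that check using two rows from my node list; the new pod's size is made up, and the real scheduler also considers other things (pod count limits, taints, affinity and so on) that this ignores.

```python
def fits(allocatable, requested, pod):
    """Return True if the pod's requests fit in the node's unreserved resources."""
    return all(allocatable[r] - requested[r] >= pod[r] for r in pod)


# Per-node allocatable and already-requested amounts, taken from the node list.
allocatable = {"cpu": 7.91, "memory_gb": 27.87}
nearly_full = {"cpu": 7.90, "memory_gb": 24.91}
has_room = {"cpu": 7.05, "memory_gb": 23.34}

# Hypothetical new pod asking for half a core and 2 GB.
new_pod = {"cpu": 0.5, "memory_gb": 2.0}

print(fits(allocatable, nearly_full, new_pod))  # False: only about 0.01 CPU left
print(fits(allocatable, has_room, new_pod))     # True: about 0.86 CPU and 4.53 GB left
```

When no node passes a check like this, the pod stays pending and the autoscaler adds a node (if it is allowed to).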

I’ve just provisioned 147 pods on a 6 node auto-scaling test cluster with amazing results.

Name | Status | CPU requested | CPU allocatable | Memory requested | Memory allocatable
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.9 CPU | 7.91 CPU | 24.91 GB | 27.87 GB
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.91 CPU | 7.91 CPU | 27.07 GB | 27.87 GB
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.88 CPU | 7.91 CPU | 21.47 GB | 27.87 GB
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.85 CPU | 7.91 CPU | 25.14 GB | 27.87 GB
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.05 CPU | 7.91 CPU | 23.34 GB | 27.87 GB
gke-eu-west-n1std4-lssdx-default-pool | Ready | 7.8 CPU | 7.91 CPU | 24.41 GB | 27.87 GB
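
To put a number on how well packed that is, here is a small sketch that totals the requested and allocatable figures from the list above. The values are copied from the six rows; the variable names are just mine.

```python
# (cpu_requested, cpu_allocatable, mem_requested_gb, mem_allocatable_gb) per node
nodes = [
    (7.90, 7.91, 24.91, 27.87),
    (7.91, 7.91, 27.07, 27.87),
    (7.88, 7.91, 21.47, 27.87),
    (7.85, 7.91, 25.14, 27.87),
    (7.05, 7.91, 23.34, 27.87),
    (7.80, 7.91, 24.41, 27.87),
]

cpu_requested = sum(n[0] for n in nodes)
cpu_allocatable = sum(n[1] for n in nodes)
mem_requested = sum(n[2] for n in nodes)
mem_allocatable = sum(n[3] for n in nodes)

cpu_pct = 100 * cpu_requested / cpu_allocatable
mem_pct = 100 * mem_requested / mem_allocatable

print(f"CPU requested: {cpu_requested:.2f} of {cpu_allocatable:.2f} ({cpu_pct:.1f}%)")
print(f"Memory requested: {mem_requested:.2f} of {mem_allocatable:.2f} GB ({mem_pct:.1f}%)")
```

That works out to roughly 98% of allocatable CPU and 88% of allocatable memory being requested across the cluster.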

#6

Awesome! Happy to hear it's working out :slight_smile: