Google Cloud Engine: CPU quota exceeded for a simple cluster!?

giuliohome · May 1, 2022, 1:07pm

Hi all,
I’ve a question about the configuration of Google Cloud for Kubernetes.

First of all, I wanted to try an example of observability from a blog post.
It was written for docker-compose, so I adapted it to Kubernetes. The result is in my GitHub repository (below), in particular the /local folder, which is the part I want to discuss.

As you can see, I’ve already deployed it to Azure Kubernetes Services and it is all working fine.
My problem is that I was not able to do the same on Google Cloud, because I hit quota limits on CPUs and in-use addresses, which seems very strange to me. As I understand it, they are not accepting quota increases; on the other hand, this is not a resource-demanding scenario: it runs perfectly on Azure with a node pool of 2 standard_d2as_v5 nodes.
I can’t believe that it’s not possible to run the same pods on Google Cloud Engine, so there must be something I’m doing wrong.
I’ve also tried both an Autopilot cluster on GKE and a standard one, but in both cases I had no luck and ended up with the same CPU and address quota issue…
Any suggestions, please? Thank you!

Cluster information:

Cloud being used: Google Kubernetes Engine
Installation method: Terraform

See, for example, the Terraform file for the Autopilot cluster:

github.com

giuliohome/gcp-k8s-sql-tf/blob/c8a2a8fbbca08f5cc8559c93865305764e016c83/k8s/gke.tf

data "google_client_config" "default" {}

# GKE cluster
resource "google_container_cluster" "primary" {
  name     = "${var.project_id}-${var.cluster_name}-mytf"
  location = var.region
  
  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  // remove_default_node_pool = true
  // initial_node_count       = 1
  
  network    = "projects/${var.project_id}/global/networks/default"

This file has been truncated.

giuliohome · May 1, 2022, 1:33pm

For example, this is the error message from terraform apply when I set machine_type = "n2d-standard-4" (inspired by the AKS setup) instead of my default of "n2-standard-2". The default is not enough either: I get the very same error once the cluster has to scale after I kubectl apply -f the deployment YAML for the Loki, Prometheus and Grafana pods…


Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_container_node_pool.gke_node_pool: Creating...
╷
│ Error: error creating NodePool: googleapi: Error 403: 
│       (1) insufficient regional quota to satisfy request: resource "IN_USE_ADDRESSES": request requires '6.0' and is short '2.0'. project has a quota of '4.0' with '4.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=my-cloud-giulio
│       (2) insufficient regional quota to satisfy request: resource "N2D_CPUS": request requires '24.0' and is short '16.0'. project has a quota of '8.0' with '8.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=my-cloud-giulio., forbidden
│
│   with google_container_node_pool.gke_node_pool,
│   on gke.tf line 45, in resource "google_container_node_pool" "gke_node_pool":
│   45: resource "google_container_node_pool" "gke_node_pool" {
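
One possible reading of these numbers, offered as an assumption rather than something confirmed in this thread: the cluster above sets location = var.region, and in a regional GKE cluster the node pool's node_count applies per zone. With three zones and 2 nodes per zone that would be 6 nodes, which matches the 6 addresses requested, and 6 nodes of n2d-standard-4 (4 vCPUs each) matches the 24 N2D CPUs. A minimal sketch of a node pool restricted to one zone is shown below. The pool name, the zone and the node count are hypothetical, and var.region and google_container_cluster.primary are taken from the gke.tf quoted earlier; the real file may differ.

```hcl
# Sketch only: restrict the node pool to a single zone so that node_count
# is not multiplied by the number of zones in the region.
resource "google_container_node_pool" "gke_node_pool" {
  name     = "single-zone-pool" # hypothetical name
  cluster  = google_container_cluster.primary.name
  location = var.region

  # Assumption: "europe-west1-b" stands in for one zone of var.region.
  node_locations = ["europe-west1-b"]

  # 2 nodes x 4 vCPUs = 8 N2D CPUs and 2 in-use addresses,
  # which would fit the quotas of 8 and 4 reported in the error above.
  node_count = 2

  node_config {
    machine_type = "n2d-standard-4"
  }
}
```

Whether a single-zone pool is acceptable depends on how much availability the demo needs; requesting a smaller machine type would be another way to stay under the same quota.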

giuliohome · May 1, 2022, 6:35pm

Since I realize this is a general Kubernetes forum, let me also add a more abstract question, one that is not tied to a specific cloud provider but only to a Kubernetes application such as this sample.

As I wrote at the beginning of this thread, the original application runs with docker compose on a laptop, on a single CPU so to speak. So far so good. But now, after translating it to Kubernetes via Kompose, why does deploying the same conceptual application (a would-be trivial observability demo) require so many CPUs (24) that it exceeds the regional quota of a well-known provider like Google? (Unless one reaches an agreement through a sales contact, but that’s not the point here.)
I’d love to hear from a Kubernetes expert whether this container orchestration introduces heavy CPU overhead per se, or whether I’m doing something wrong (in that case, please say where).

giuliohome · May 1, 2022, 8:34pm

Indeed, I’ve just solved it by generalizing the Azure-specific CSI driver to a more abstract persistent volume claim.

I see my Grafana dashboard on Google Cloud, finally!
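
For readers wondering what a provider-neutral claim might look like, here is a minimal sketch. It is not the manifest from the repository: the thread applies YAML with kubectl, while this version uses the Terraform kubernetes provider only to stay in the same language as gke.tf. The resource name, claim name and size are hypothetical, and a configured kubernetes provider is assumed.

```hcl
# Sketch only: a claim that names no cloud-specific CSI driver and relies
# on whatever default StorageClass the cluster provides (AKS, GKE, ...).
resource "kubernetes_persistent_volume_claim" "grafana_data" {
  metadata {
    name = "grafana-data" # hypothetical claim name
  }

  spec {
    access_modes = ["ReadWriteOnce"]

    resources {
      requests = {
        storage = "1Gi" # hypothetical size
      }
    }
    # storage_class_name is left unset on purpose, so the cluster default is used.
  }

  # Do not block terraform apply if the default StorageClass binds the
  # volume only when a pod first uses the claim.
  wait_until_bound = false
}
```

A pod then refers to the claim by name in its volumes section, so the same deployment can run unchanged on either provider.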
