Cluster information:
Kubernetes version: 1.19.9-gke.1400
Cloud being used: Google Cloud (GKE)
Installation method:
Host OS: Ubuntu with docker
CNI and version:
CRI and version:
Hi, I am trying to create a GPU node pool (gpu-pool) alongside the default-pool in a GKE cluster. First, I requested an increase of my GPU quota (limit name: NVIDIA P100 GPUs) for a specific region, and I received a confirmation email saying the request had been approved. But when I tried to create a GPU node, I got the following error:
Google Compute Engine: Not all instances running in IGM after 30.958909599s. Expected 1, running 0, transitioning 1. Current errors: [GCE_STOCKOUT]: Instance ‘gke-acai-prod-gpu-pool-753a2cbd-5kvq’ creation failed: The zone ‘projects/mcds-capstone-acai/zones/us-east1-c’ does not have enough resources available to fulfill the request. Try a different zone, or try again later.
Then I created a new reservation for 1 GPU with the following requirements.
But I still get an error when I try to create a new cluster with a GPU node of similar configuration in Kubernetes Engine:
Insufficient regional quota to satisfy request: resource “NVIDIA_P100_GPUS”: request requires ‘1.0’ and is short ‘1.0’. project has a quota of ‘1.0’ with ‘0.0’ available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=mcds-capstone-acai.
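To make the numbers in this message explicit, here is a minimal Python sketch that parses it. `parse_quota_error` is a hypothetical helper written for this post, not part of any Google Cloud tool, and it assumes the message format shown above with the quotes normalized to ASCII.

```python
import re

# Quota message copied from the error above (curly quotes normalized to ASCII).
message = (
    "Insufficient regional quota to satisfy request: resource "
    "\"NVIDIA_P100_GPUS\": request requires '1.0' and is short '1.0'. "
    "project has a quota of '1.0' with '0.0' available."
)


def parse_quota_error(text):
    """Hypothetical helper: extract the quota figures from the message."""
    pattern = re.compile(
        r'resource "(?P<resource>[A-Z0-9_]+)": '
        r"request requires '(?P<required>[\d.]+)' and is short '(?P<short>[\d.]+)'\. "
        r"project has a quota of '(?P<quota>[\d.]+)' with '(?P<available>[\d.]+)' available"
    )
    match = pattern.search(text)
    if match is None:
        raise ValueError("unrecognized quota message")
    fields = match.groupdict()
    result = {"resource": fields["resource"]}
    for key in ("required", "short", "quota", "available"):
        result[key] = float(fields[key])
    # Whatever is not available is already counted as used by the project.
    result["in_use"] = result["quota"] - result["available"]
    return result


info = parse_quota_error(message)
print(info)
```

The parsed values show that the entire regional quota of 1.0 is already counted as in use, so the new request is short by exactly the 1.0 it asks for. This suggests that something, possibly the reservation I created above, is already holding the single GPU, but I have not confirmed that.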
It would be great if anyone who has faced similar issues could share their experience. Thanks in advance.