GKE Cluster problem detected

Cluster information:

Kubernetes version: master 1.14.10-gke.50, nodes 1.11.9-gke.13
Cloud being used: GKE
Installation method: Google Cloud
Host OS: Google Container Optimized OS

Hi everyone,

Yesterday, the following message appeared in the Google Cloud GKE panel above my production cluster's details:

Cluster problem detected. Please refer to this page to fix this.

I can access the general details of the cluster, but I can't access the PV and node information.
Commands like

kubectl get nodes
kubectl get pv
kubectl get pvc
kubectl get pods

work and return the correct data.

Also, if I try any kind of node pool operation with gcloud, I get the following error:

ERROR: (gcloud.container.node-pools.create) ResponseError: code=400, message=Cluster is currently being created, deleted, updated or repaired and cannot be updated.
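A quick way to check which operation is holding the cluster, with CLUSTER_NAME and ZONE as placeholders for your own values:

# List recent GKE operations to see if one is stuck on the cluster
gcloud container operations list \
  --zone ZONE \
  --filter="targetLink:CLUSTER_NAME"

# Check the cluster's reported status (RUNNING, RECONCILING, ERROR, ...)
gcloud container clusters describe CLUSTER_NAME \
  --zone ZONE \
  --format="value(status)"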

I've followed every step on that page, but the problem persists, and I don't know what to try next.
Any ideas about what is causing the issue?

Thanks

It's a very old thread, but it's the only Google result when searching for this exact error message.

I ran into this issue because my authoritative google_project_iam_policy Terraform resource wiped out the default GCP IAM bindings on the project.
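If you suspect the same root cause, a quick sanity check (PROJECT_ID is a placeholder) is to dump the project's IAM policy and confirm whether the GKE service agent still holds any roles at all:

# Show which roles the GKE robot currently holds; an empty result
# means the default bindings were wiped
gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:container-engine-robot.iam.gserviceaccount.com" \
  --format="table(bindings.role)"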

What is NOT mentioned in the Google troubleshooting document is that the roles roles/compute.serviceAgent and roles/compute.networkUser also need to be granted to the following service accounts (in addition to your cluster service accounts):

  • serviceAccount:service-${project-number}@compute-system.iam.gserviceaccount.com
  • serviceAccount:service-${project-number}@container-engine-robot.iam.gserviceaccount.com

With these bindings in place, cluster functionality is restored.
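In case it saves someone a lookup, granting the missing bindings by hand looks roughly like this (PROJECT_ID and PROJECT_NUMBER are placeholders; repeat for each missing role/member combination):

# Restore the Compute Engine service agent's default role
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@compute-system.iam.gserviceaccount.com" \
  --role="roles/compute.serviceAgent"

# Grant the GKE service agent network access on the project
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"

Keep in mind that if google_project_iam_policy stays authoritative, any bindings added with gcloud will be wiped again on the next terraform apply, so the bindings really need to live in the Terraform policy itself.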