klyst
July 29, 2024, 1:58pm
1
I am unable to delete a namespace on my baremetal Kubernetes 1.28.2 cluster.
When I describe the namespace, it indicates that there is one remaining resource, which doesn’t make sense because I just created the namespace for this example:
$ kubectl create ns test-namespace3
namespace/test-namespace3 created
$ kubectl delete ns test-namespace3
namespace "test-namespace3" deleted
$ kubectl describe ns test-namespace3
Name: test-namespace3
Labels: kubernetes.io/metadata.name=test-namespace3
Annotations: <none>
Status: Terminating
Conditions:
Type Status LastTransitionTime Reason Message
---- ------ ------------------ ------ -------
NamespaceDeletionDiscoveryFailure False Mon, 29 Jul 2024 15:02:30 +0200 ResourcesDiscovered All resources successfully discovered
NamespaceDeletionGroupVersionParsingFailure False Mon, 29 Jul 2024 15:02:30 +0200 ParsedGroupVersions All legacy kube types successfully parsed
NamespaceDeletionContentFailure True Mon, 29 Jul 2024 15:02:30 +0200 ContentDeletionFailed Failed to delete all resource types, 1 remaining: Internal error occurred: error resolving resource
NamespaceContentRemaining False Mon, 29 Jul 2024 15:02:30 +0200 ContentRemoved All content successfully removed
NamespaceFinalizersRemaining False Mon, 29 Jul 2024 15:02:30 +0200 ContentHasNoFinalizers All content-preserving finalizers finished
No resource quota.
No LimitRange resource.
My interpretation: it doesn't seem to be a finalizer problem, as the describe output indicates that all finalizers have completed.
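For completeness, the finalizers can also be read directly instead of relying on describe. This is only a sketch: the real kubectl command sits in a comment, and a hypothetical jsonpath result stands in for live cluster output:

```shell
# On the live cluster, the remaining finalizers can be read directly:
#   kubectl get ns test-namespace3 -o jsonpath='{.spec.finalizers}'
# Hypothetical result of that query (the default "kubernetes" finalizer):
finalizers='["kubernetes"]'

# An empty list (or no output at all) confirms nothing is blocking on finalizers
if [ -z "$finalizers" ] || [ "$finalizers" = "[]" ]; then
  echo "no finalizers remaining"
else
  echo "finalizers still set: $finalizers"
fi
```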
When I check the controller logs:
E0729 13:20:49.553118 1 namespace_controller.go:159] deletion of namespace test-namespace3 failed: Internal error occurred: error resolving resource
There is nothing related to test-namespace3 in the API server logs:
$ kubectl logs kube-apiserver-edge-master.seed -n kube-system | grep test-namespace3
# Nothing
Does anyone have any idea where the problem might be coming from?
Cluster information:
Kubernetes version: 1.28.2
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: Ubuntu
CNI and version: cilium 1.14.3
CRI and version: cri-o 1.28.1
From what I understand:
You just created a namespace and deleted it
After this, the namespace is stuck in Terminating instead of being removed
In your describe command we have this information:
NamespaceDeletionContentFailure True Mon, 29 Jul 2024 15:02:30 +0200 ContentDeletionFailed Failed to delete all resource types, 1 remaining: Internal error occurred: error resolving resource
Do you see any other issues? Is your control plane running without problems?
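To answer that question quickly, a control-plane health check could look like the sketch below. The kubectl commands are shown as comments, and a hypothetical pod listing (including an imaginary failing pod) stands in for live output:

```shell
# On the live cluster, control-plane health can be checked with e.g.:
#   kubectl get pods -n kube-system
#   kubectl get --raw='/readyz?verbose'
# Hypothetical sample of the pod listing:
sample='NAME                                    READY   STATUS             RESTARTS
kube-apiserver-edge-master.seed         1/1     Running            0
kube-controller-manager-edge-master     1/1     Running            2
kube-scheduler-edge-master.seed         0/1     CrashLoopBackOff   7'

# Print any control-plane pod that is not fully ready and running
echo "$sample" | awk 'NR > 1 && ($2 != "1/1" || $3 != "Running")'
```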
klyst
July 29, 2024, 3:48pm
4
Hello @AChichi
Thanks for your answer. You understand it correctly.
From my kube-controller-manager, I have this (filtered to errors only):
kubectl logs kube-controller-manager-edge-master.seed -n kube-system -f | grep "^E"
E0729 15:31:39.027869 1 reflector.go:147] vendor/k8s.io/client-go/metadata/metadatainformer/informer.go:106: Failed to watch *v1.PartialObjectMetadata: failed to list *v1.PartialObjectMetadata: Internal error occurred: error resolving resource
E0729 15:31:41.557125 1 shared_informer.go:314] unable to sync caches for garbage collector
E0729 15:31:41.557164 1 garbagecollector.go:261] timed out waiting for dependency graph builder sync during GC sync (attempt 14422)
E0729 15:31:46.121924 1 namespace_controller.go:159] deletion of namespace auth failed: Internal error occurred: error resolving resource
E0729 15:31:47.388350 1 namespace_controller.go:159] deletion of namespace test-namespace3 failed: Internal error occurred: error resolving resource
E0729 15:32:11.660090 1 shared_informer.go:314] unable to sync caches for garbage collector
E0729 15:32:11.660136 1 garbagecollector.go:261] timed out waiting for dependency graph builder sync during GC sync (attempt 14423)
E0729 15:32:21.426415 1 reflector.go:147] vendor/k8s.io/client-go/metadata/metadatainformer/informer.go:106: Failed to watch *v1.PartialObjectMetadata: failed to list *v1.PartialObjectMetadata: Internal error occurred: error resolving resource
These log lines repeat in a loop.
Then from the kube-apiserver, I got this (again, only errors thanks to grep):
kubectl logs kube-apiserver-edge-master.seed -n kube-system -f | grep "^E"
E0729 15:12:47.684482 1 customresource_handler.go:301] unable to load root certificates: unable to parse bytes as PEM block
For this one, I don't think it's the root cause, because all kubectl commands run correctly except the namespace deletion.
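The "error resolving resource" message usually points at an API group the server can no longer serve, so listing unavailable APIService objects can narrow it down. A sketch, with a hypothetical listing in place of live output (on the cluster, the single kubectl command in the comment is enough):

```shell
# On the live cluster:
#   kubectl get apiservices
# Hypothetical sample output, with one API group unavailable:
sample='NAME                          SERVICE   AVAILABLE                  AGE
v1.apps                       Local     True                       300d
v1beta1.serving.kserve.io     Local     False (MissingEndpoints)   100d'

# Keep only APIServices whose AVAILABLE column is not True
echo "$sample" | awk 'NR > 1 && $3 != "True"'
```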
klyst
July 30, 2024, 2:51pm
5
I was able to resolve the error.
Here are the steps I took:
I listed the resources that have a namespaced scope (i.e., those that are not global to the cluster):
LIST=$(kubectl api-resources --verbs=list --namespaced -o name | tr "\n" " ")
I looped over this list in the stuck namespace to find the resource type that fails (replace the placeholder with any terminating namespace):
for elt in $(echo $LIST); do kubectl get --show-kind --ignore-not-found -n <ANY terminating ns> $elt; done
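Since the failures in that loop arrive on stderr, discarding stdout makes the broken resource type stand out. Below, a stand-in function simulates kubectl purely for illustration; on the cluster, the commented one-liner is the real variant:

```shell
# On the live cluster, keep only the errors:
#   for elt in $(echo $LIST); do kubectl get -n <ANY terminating ns> $elt >/dev/null; done
# Purely illustrative stand-in for kubectl, failing on one resource type:
fake_kubectl_get() {
  if [ "$1" = "inferenceservices.serving.kserve.io" ]; then
    echo "Error from server (InternalError): error resolving resource" >&2
    return 1
  fi
  echo "NAME   AGE"   # a normal listing goes to stdout and is discarded
}

# Only the error line survives the stdout redirection
for elt in pods services inferenceservices.serving.kserve.io; do
  fake_kubectl_get "$elt" >/dev/null || true
done
```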
I found that inferenceservices.serving.kserve.io
is causing an issue, so I ran a describe command:
kubectl describe crd inferenceservices.serving.kserve.io
Message: could not list instances: unable to find a custom resource client for [inferenceservices.serving.kserve.io]: unable to load root certificates: unable to parse bytes as PEM block
Reason: InstanceDeletionFailed
Status: True
Type: Terminating
There is an error message, and the CRD itself is stuck in the Terminating phase.
This is the same issue as here: fail installing KServe · Issue #2349 · kubeflow/manifests · GitHub
Since this CRD is stuck in a terminating state and is used in my namespace, I removed its finalizer and then deleted the resource:
kubectl patch crd/inferenceservices.serving.kserve.io -p '{"metadata":{"finalizers":[]}}' --type=merge
kubectl delete crd inferenceservices.serving.kserve.io
After completing these steps, I successfully removed all terminating namespaces.
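As a final check, a sweep like the one below confirms nothing is left in Terminating. The kubectl command is in the comment; a hypothetical healthy listing stands in for live output:

```shell
# On the live cluster:
#   kubectl get ns --no-headers
# Hypothetical output after the cleanup:
sample='default       Active   300d
kube-system   Active   300d'

# Count namespaces whose STATUS column is still Terminating
stuck=$(echo "$sample" | awk '$2 == "Terminating" {n++} END {print n+0}')
echo "namespaces still terminating: $stuck"
```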