ziad
February 27, 2023, 11:29am
1
Cluster information:
Kubernetes version: v1.26.1
Cloud being used: bare metal
Installation method: kubeadm
Host OS: CentOS stream 8
CNI and version: 0.3.1, flannel with RBAC integrated
CRI and version: cgroup as installed with Docker 20.10.17
My kubeadm config:
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.26.0
networking:
  podSubnet: 10.244.0.0/16 # --pod-network-cidr
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: cgroupfs
My pods:
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel kube-flannel-ds-ph2mx 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system coredns-787d4945fb-9k5cl 0/1 Pending 0 2d19h <none> <none> <none> <none>
kube-system coredns-787d4945fb-vn7l6 0/1 Pending 0 2d19h <none> <none> <none> <none>
kube-system etcd-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-apiserver-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-controller-manager-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-proxy-5dltp 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-scheduler-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
And finally the control-plane nodes:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
scc-fineci-amd NotReady control-plane 2d19h v1.26.1
The problem is that coredns stays in Pending state, and the control-plane node remains NotReady. Not sure what I did wrong.
Is it maybe cgroupfs? I kind of have my hands tied there because I have a LOT of Docker activity on that server that could be disrupted.
ziad:
coredns-787d4945fb-9k5cl
What is the result of kubectl describe pod coredns-787d4945fb-9k5cl -n kube-system?
ziad
March 1, 2023, 10:21am
3
felixdpg:
coredns-787d4945fb-9k5cl
Here is the output:
# kubectl describe pod coredns-787d4945fb-9k5cl -n kube-system
Name: coredns-787d4945fb-9k5cl
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: coredns
Node: <none>
Labels: k8s-app=kube-dns
pod-template-hash=787d4945fb
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/coredns-787d4945fb
Containers:
coredns:
Image: registry.k8s.io/coredns/coredns:v1.9.3
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w2r49 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-w2r49:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 24s (x1379 over 4d18h) default-scheduler 0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
Warning FailedScheduling 24s (x1379 over 4d18h) default-scheduler **0/1 nodes are available**: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
Have you deployed a network plugin?
CoreDNS will always remain pending until the pod network is up. It essentially has no network to exist on until then.
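One way to follow up on this is to look at the node rather than the pod: the scheduler message only says that the node carries the node.kubernetes.io/not-ready taint, and the node's conditions usually say why. The sketch below wraps two kubectl queries in helper functions. The function names are made up for illustration, and it assumes kubectl is already configured for the cluster.

```shell
#!/bin/sh
# Sketch: inspect why a node is NotReady. Helper names are hypothetical.

# Print each node condition as "TYPE<TAB>STATUS<TAB>MESSAGE".
node_conditions() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Print each taint on the node as "KEY<TAB>EFFECT".
node_taints() {
  kubectl get node "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}{"\t"}{.effect}{"\n"}{end}'
}

# Example, using the node name from this thread:
#   node_conditions scc-fineci-amd
#   node_taints scc-fineci-amd
```

If the Ready condition is False or a pressure condition such as DiskPressure is True, the message column is usually the fastest pointer to the underlying cause.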
ziad
March 1, 2023, 11:42am
6
I have flannel installed, having applied its YAML config with kubectl. By network plugin do you mean something like this: GitHub - flannel-io/cni-plugin?
I did not see any requirement for it in the official K8s installation docs. Its page says it will create a flannel executable under /bin, but I already have one under /opt/cni/bin (off-path) that is linked to from /usr/bin (on-path). Will these executables conflict?
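For anyone checking the same thing, here is a minimal sketch that looks for the flannel CNI binary and config in the conventional locations. It assumes the default CNI directories /opt/cni/bin and /etc/cni/net.d (the container runtime can be configured to use others), and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: check for the flannel CNI binary and config file.
# Default paths are assumptions; pass other directories as arguments if needed.
check_flannel_cni() {
  conf_dir=${1:-/etc/cni/net.d}
  bin_dir=${2:-/opt/cni/bin}

  # The CNI plugin binary must be an executable file named "flannel".
  if [ -x "$bin_dir/flannel" ]; then
    echo "plugin binary: found $bin_dir/flannel"
  else
    echo "plugin binary: missing in $bin_dir"
  fi

  # The flannel manifest normally writes a config such as 10-flannel.conflist.
  if ls "$conf_dir"/*flannel* >/dev/null 2>&1; then
    echo "config: found in $conf_dir"
  else
    echo "config: missing in $conf_dir"
  fi
}

# Example:
#   check_flannel_cni
```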
ziad
March 1, 2023, 1:01pm
8
Fixed it! It was the classic ephemeral storage issue. I added some less restrictive eviction manager settings in /var/lib/kubelet/config.yaml, restarted kubelet.service, and it’s all good now.
This is the snippet I added to /var/lib/kubelet/config.yaml, which is the generally accepted remedy for an overly aggressive eviction manager:
evictionHard:
  imagefs.available: 1%
  memory.available: 100Mi
  nodefs.available: 1%
  nodefs.inodesFree: 1%
After that, restart kubelet.service (prepend sudo if you are not the root user):
systemctl restart kubelet
Check your nodes and pods:
$> kubectl get nodes
NAME STATUS ROLES AGE VERSION
scc-fineci-amd Ready control-plane 4d21h v1.26.1
$> kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-ph2mx 1/1 Running 1 (107m ago) 4d21h
kube-system coredns-787d4945fb-9k5cl 1/1 Running 0 4d21h
kube-system coredns-787d4945fb-vn7l6 1/1 Running 0 4d21h
kube-system etcd-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-apiserver-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-controller-manager-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-proxy-5dltp 1/1 Running 1 (107m ago) 4d21h
kube-system kube-scheduler-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
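For reference, the kubelet's default hard eviction thresholds on Linux are memory.available<100Mi, nodefs.available<10%, imagefs.available<15% and nodefs.inodesFree<5%, so the evictionHard snippet above lowers the three disk thresholds to 1%. Before relaxing them, it can help to see how full the disks actually are. The sketch below is one rough way to do that. The helper name and the two paths are assumptions: /var/lib/kubelet is the kubelet's default root directory (nodefs) and /var/lib/docker is Docker's default data directory (imagefs). The df figure only approximates what the kubelet computes.

```shell
#!/bin/sh
# Sketch: print the free-space percentage of the filesystem holding a path.
# The helper name is hypothetical.
free_percent() {
  # `df -P` prints one POSIX-format line per filesystem;
  # column 5 is the used capacity, e.g. "42%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Example (paths are assumed defaults, adjust to your setup):
#   echo "nodefs free:  $(free_percent /var/lib/kubelet)%"
#   echo "imagefs free: $(free_percent /var/lib/docker)%"
```

If the numbers are close to the default thresholds, freeing disk space may be a safer fix than lowering the thresholds.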
oussema
December 9, 2023, 12:07pm
9
I got the same problem: coredns stays in Pending.
I tried the solution of adding this configuration to /var/lib/kubelet/config.yaml:
evictionHard:
  imagefs.available: 1%
  memory.available: 100Mi
  nodefs.available: 1%
  nodefs.inodesFree: 1%
But I still have the same problem.