ziad
February 27, 2023, 11:29am
1
Cluster information:
Kubernetes version: v1.26.1
Cloud being used: bare metal
Installation method: kubeadm
Host OS: CentOS stream 8
CNI and version: flannel (CNI spec 0.3.1), with RBAC integrated
CRI and version: cgroup as installed with Docker 20.10.17
My kubeadm config:
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.26.0
networking:
  podSubnet: 10.244.0.0/16 # --pod-network-cidr
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: cgroupfs
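For completeness, a config like this would typically be passed to kubeadm init along these lines (the filename here is just an example, and any extra flags such as --cri-socket depend on the container runtime setup):
sudo kubeadm init --config kubeadm-config.yaml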
My pods:
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel kube-flannel-ds-ph2mx 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system coredns-787d4945fb-9k5cl 0/1 Pending 0 2d19h <none> <none> <none> <none>
kube-system coredns-787d4945fb-vn7l6 0/1 Pending 0 2d19h <none> <none> <none> <none>
kube-system etcd-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-apiserver-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-controller-manager-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-proxy-5dltp 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
kube-system kube-scheduler-scc-fineci-amd 1/1 Running 0 2d19h 141.52.72.27 scc-fineci-amd <none> <none>
And finally the control-plane nodes:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
scc-fineci-amd NotReady control-plane 2d19h v1.26.1
The problem is that coredns stays in the Pending state, and the control-plane node remains NotReady. Not sure what I did wrong.
Is it maybe cgroupfs? My hands are kind of tied there, because I have a LOT of Docker activity on that server that could be disrupted.
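For reference, the cgroup driver that Docker itself reports on this host can be checked with something like:
docker info 2>/dev/null | grep -i 'cgroup driver'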
ziad:
coredns-787d4945fb-9k5cl
What is the result of kubectl describe pod coredns-787d4945fb-9k5cl -n kube-system?
ziad
March 1, 2023, 10:21am
3
felixdpg:
coredns-787d4945fb-9k5cl
Here is the output:
# kubectl describe pod coredns-787d4945fb-9k5cl -n kube-system
Name: coredns-787d4945fb-9k5cl
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: coredns
Node: <none>
Labels: k8s-app=kube-dns
pod-template-hash=787d4945fb
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/coredns-787d4945fb
Containers:
coredns:
Image: registry.k8s.io/coredns/coredns:v1.9.3
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w2r49 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-w2r49:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 24s (x1379 over 4d18h) default-scheduler 0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
Have you deployed a network plugin? CoreDNS will always remain Pending until the pod network is up; it essentially has no network to exist on until then.
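You can also see why the node is still carrying the not-ready taint by looking at its conditions and taints directly (node name taken from your output above):
kubectl describe node scc-fineci-amd | grep -i -A3 taints
kubectl get node scc-fineci-amd -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
The Ready condition's message usually spells out what the kubelet is still waiting for (for example, a missing CNI config).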
ziad
March 1, 2023, 11:42am
6
I have flannel installed; I applied its YAML manifest with kubectl. By network plugin, do you mean something like this: GitHub - flannel-io/cni-plugin? I did not see any requirement for it in the official K8s installation docs.
Its page says it will create a flannel executable under /bin, but I already have one under /opt/cni/bin (off-path) that is linked to from /usr/bin (on-path). Will these executables conflict?
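For reference, from what I understand the kubelet looks for CNI plugin binaries under /opt/cni/bin and for the network config under /etc/cni/net.d, so listing both is a quick sanity check:
ls -l /opt/cni/bin     # CNI plugin binaries (flannel, bridge, portmap, ...)
ls -l /etc/cni/net.d   # CNI network config, normally written by the flannel DaemonSet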
ziad
March 1, 2023, 1:01pm
8
Fixed it! It was the classic ephemeral-storage issue. I added some less restrictive eviction-manager settings in /var/lib/kubelet/config.yaml, restarted kubelet.service, and it's all good now.
This is the snippet added to /var/lib/kubelet/config.yaml, which is the generally accepted remedy for an overly aggressive eviction manager:
evictionHard:
  imagefs.available: 1%
  memory.available: 100Mi
  nodefs.available: 1%
  nodefs.inodesFree: 1%
After that, restart kubelet.service (prepend sudo if you are not the root user):
systemctl restart kubelet
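If you want to confirm it really was disk/ephemeral-storage pressure tripping the default thresholds, checking free space and inodes on the kubelet's filesystem and the node's pressure conditions should show it, for example:
df -h /var/lib/kubelet
df -i /var/lib/kubelet
kubectl describe node scc-fineci-amd | grep -i pressure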
Check your nodes and pods:
$> kubectl get nodes
NAME STATUS ROLES AGE VERSION
scc-fineci-amd Ready control-plane 4d21h v1.26.1
$> kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-ph2mx 1/1 Running 1 (107m ago) 4d21h
kube-system coredns-787d4945fb-9k5cl 1/1 Running 0 4d21h
kube-system coredns-787d4945fb-vn7l6 1/1 Running 0 4d21h
kube-system etcd-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-apiserver-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-controller-manager-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h
kube-system kube-proxy-5dltp 1/1 Running 1 (107m ago) 4d21h
kube-system kube-scheduler-scc-fineci-amd 1/1 Running 1 (107m ago) 4d21h