It is important to recognise that things can go wrong, but MicroK8s gives you tools to help work out what has happened, as detailed below. Be sure to check out the common issues section for help resolving the most frequently encountered problems.
Checking logs
If a pod is not behaving as expected, the first port of call should be the logs.
Assuming you run a simple workload in a namespace called redis, first determine the resource identifier for the pod:
microk8s kubectl get pods -n redis
This will list the currently available pods, for example:
NAME READY STATUS RESTARTS AGE
redis 0/1 ErrImagePull 0 84m
You can then use kubectl to view the log. For example, for the simple redis pod above:
microk8s kubectl logs redis -n redis
Error from server (BadRequest): container "redis" in pod "redis" is waiting to start: image can't be pulled
If this information is not sufficient, you can look into the events to find out more:
microk8s kubectl get events -n redis
LAST SEEN TYPE REASON OBJECT MESSAGE
5m9s Warning Failed pod/redis Failed to pull image "redis:XXlatest": failed to pull and unpack image "docker.io/library/redis:XXlatest": failed to resolve reference "docker.io/library/redis:XXlatest": unexpected status from HEAD request to https://www.docker.com/: 403 Forbidden
5m9s Warning Failed pod/redis Error: ErrImagePull
3m59s Warning BackOff pod/redis Back-off restarting failed container redis in pod redis_redis(b4ec0dac-609d-48c2-955a-1d6abc1c42b0)
In this specific case there is no such image in the image registry, so the pod specification needs to be adjusted.
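As an illustration of such an adjustment (the redis:latest tag below is only an example of a valid image reference; a container's image is one of the few pod fields that can be changed in place):
microk8s kubectl -n redis set image pod/redis redis=redis:latest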
Examining the configuration
If the problem you are experiencing indicates a problem with the configuration of the Kubernetes components themselves, it could be helpful to examine the arguments used to run these components.
These are available in the directory ${SNAP_DATA}/args, where $SNAP_DATA on Ubuntu points to /var/snap/microk8s/current.
Note that the $SNAP_DATA environment variable itself is only available to the running snap. For more information on the snap environment, check the snap documentation.
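For example, to list the available argument files and see which flags a particular component is started with (the kube-apiserver file is shown here; the exact set of files depends on your MicroK8s version):
sudo ls /var/snap/microk8s/current/args
sudo cat /var/snap/microk8s/current/args/kube-apiserver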
Using the built-in inspection tool
MicroK8s ships with a script to compile a complete report on MicroK8s and the system which it is running on. This is essential for bug reports, but is also a useful way of confirming the system is (or isn’t) working and collecting all the relevant data in one place.
To run the inspection tool, enter the command (admin privilege is required to collect all the data):
sudo microk8s inspect
You should see output similar to the following:
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-flanneld is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-apiserver-kicker is running
Service snap.microk8s.daemon-proxy is running
Service snap.microk8s.daemon-kubelet is running
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy current linux distribution to the final report tarball
Copy openSSL information to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Building the report tarball
Report tarball is at /var/snap/microk8s/1031/inspection-report-20191104_153950.tar.gz
This confirms the services that are running, and the resultant report file can be viewed to get a detailed look at every aspect of the system.
Common issues
Node is not ready when RBAC is enabled...
Ensure the hostname of your machine does not contain capital letters or underscores. Kubernetes normalizes the machine name, causing its registration to fail.
To fix this you can change the hostname or use the --hostname-override argument in kubelet's configuration in /var/snap/microk8s/current/args/kubelet.
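As a minimal sketch of that change, assuming a placeholder node name such as my-node (lowercase, no underscores), append the flag to the kubelet arguments file and restart MicroK8s:
echo '--hostname-override=my-node' | sudo tee -a /var/snap/microk8s/current/args/kubelet
microk8s stop
microk8s start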
My dns and dashboard pods are CrashLooping...
The CNI network plugin used by MicroK8s creates a vxlan.calico interface (cbr0 on pre-v1.16 releases, or cni0 on pre-v1.19 releases and non-HA deployments) when the first pod is created.
If you have ufw enabled, you'll need to allow traffic on this interface:
sudo ufw allow in on vxlan.calico && sudo ufw allow out on vxlan.calico
sudo ufw allow in on cali+ && sudo ufw allow out on cali+
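To confirm the rules are in place (standard ufw usage, not specific to MicroK8s), check the firewall status:
sudo ufw status verbose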
My pods can't reach the internet or each other (but my MicroK8s host machine can)...
Make sure packets to/from the pod network interface can be forwarded to/from the default interface on the host via the iptables tool. Such changes can be made persistent by installing the iptables-persistent package:
sudo iptables -P FORWARD ACCEPT
sudo apt-get install iptables-persistent
or, if using ufw:
sudo ufw default allow routed
The MicroK8s inspect command can be used to check the firewall configuration:
microk8s inspect
A warning will be shown if the firewall is not forwarding traffic.
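You can also check the iptables FORWARD chain policy directly; the first line of the output below shows the chain's default policy:
sudo iptables -S FORWARD | head -n 1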
My log collector is not collecting any logs...
By default, container logs are located in /var/log/pods/{id}. You have to mount this location in your log collector for it to work. The following is an example diff for fluent-bit:
@@ -36,6 +36,9 @@
         - name: varlibdockercontainers
           mountPath: /var/lib/docker/containers
           readOnly: true
+        - name: varlibdockercontainers
+          mountPath: /var/snap/microk8s/common/var/lib/containerd/
+          readOnly: true
         - name: fluent-bit-config
           mountPath: /fluent-bit/etc/
       terminationGracePeriodSeconds: 10
@@ -45,7 +48,7 @@
           path: /var/log
       - name: varlibdockercontainers
         hostPath:
-          path: /var/lib/docker/containers
+          path: /var/snap/microk8s/common/var/lib/containerd/
       - name: fluent-bit-config
         configMap:
           name: fluent-bit-config
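Before adjusting the collector, you can confirm where the logs actually live on a MicroK8s node by listing the two locations referenced above:
sudo ls /var/log/pods/
sudo ls /var/snap/microk8s/common/var/lib/containerd/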
My pods are not starting and I use ZFS...
MicroK8s switched to containerd as its container runtime in release 492. When run on ZFS, containerd must be configured to use the ZFS snapshotter. At present neither MicroK8s nor containerd does this automatically, so you must update the configuration manually. Instructions on how to do this are documented here.
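As a rough sketch of what that change involves, assuming your revision keeps the containerd configuration template at /var/snap/microk8s/current/args/containerd-template.toml and that a ZFS dataset already backs containerd's state directory (the linked instructions remain the authoritative reference), point the CRI plugin at the zfs snapshotter:
# in /var/snap/microk8s/current/args/containerd-template.toml
# (containerd v2 config syntax; your template may already contain a snapshotter line to adjust)
[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "zfs"
Then restart the services:
microk8s stop
microk8s start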
My home directory is not in /home or is on NFS and I can't get MicroK8s to work...
While not strictly a MicroK8s issue, snaps generally do not work out of the box if your home directory is mounted via NFS, or if it is not located directly under /home. See snapd bugs #1662552 and #1620771 for further information and possible workarounds.
I need to recover my HA cluster
In normal use, a MicroK8s HA cluster is self-healing. However, when testing edge versions or mixing releases, there may be occasions when the cluster needs to be recovered. This docs page details the procedure.
Raspberry Pi and systems with low disk performance
The symptoms you may observe vary: the API server may be slow, crash, or form an unstable multi-node cluster. Such problems are often traced to low-performing or misconfigured disks. In the logs of the API server you will notice the data store being slow to write to disk.
With journalctl -f -u snap.microk8s.daemon-kubelite or (prior to v1.21) journalctl -f -u snap.microk8s.daemon-apiserver you will see messages such as microk8s.daemon-kubelite[3802920]: Trace[736557743]: ---"Object stored in database" 7755ms.
To identify whether a slow disk is affecting you, you could benchmark reads with hdparm and try writing a large file with dd, for example:
hdparm -Tt /dev/sda
dd if=/dev/zero of=/tmp/test1.img bs=1G count=1
On systems such as the Raspberry Pi, the issue may be caused by devices not fully implementing the UAS specification.
In some cases, a way to mitigate the issue is to move the journald logs to volatile storage. This is done by editing /etc/systemd/journald.conf and setting Storage=volatile.
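For example (standard systemd configuration, not specific to MicroK8s), add or uncomment the setting under the [Journal] section of /etc/systemd/journald.conf:
[Journal]
Storage=volatile
and then restart the journald service for it to take effect:
sudo systemctl restart systemd-journald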
Of course, you can always consider upgrading the attached storage.
The GitHub issue "API Server hanging on raspberry pi" (canonical/microk8s #2280) documents a successful debugging of this issue.
"access denied" error on Debian 9
The snapctl that ships with Debian 9 is outdated. Here is how to replace it with a fresh one:
sudo snap install core
sudo mv /usr/bin/snapctl /usr/bin/snapctl.old
sudo ln -s /snap/core/current/usr/bin/snapctl /usr/bin/snapctl
I get "i/o timeouts" when calling "microk8s kubectl logs"
Make sure your hostname resolves correctly to the IP address of your host or localhost. The following error may indicate this misconfiguration:
microk8s kubectl logs
Error from server: Get "https://hostname:10250/containerLogs/default/...": dial tcp host-IP:10250: i/o timeout
One way to address this issue is to add the hostname and IP details of the host to /etc/hosts. In the case of a multi-node cluster, the /etc/hosts file on each machine has to be updated with the details of all cluster nodes.
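For example, entries along the following lines could be added on every node; the hostnames and addresses below are placeholders for illustration:
10.0.0.11 node-1
10.0.0.12 node-2
10.0.0.13 node-3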
I get "Unable to connect to the server: x509" on a multi-node cluster
This indicates that the certificates are not being regenerated correctly to reflect network changes. A workaround is to temporarily rename the file found at /var/snap/microk8s/current/var/lock/no-cert-reissue.
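A sketch of that workaround, assuming the certificates are reissued once the lock file is out of the way (the .bak name is just an example), is to move the file aside, restart MicroK8s so the services pick up freshly issued certificates, and then restore it:
sudo mv /var/snap/microk8s/current/var/lock/no-cert-reissue /var/snap/microk8s/current/var/lock/no-cert-reissue.bak
microk8s stop
microk8s start
sudo mv /var/snap/microk8s/current/var/lock/no-cert-reissue.bak /var/snap/microk8s/current/var/lock/no-cert-reissue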
Calico controller fails on Raspberry Pi with Ubuntu 21.10
Extra kernel modules are needed on the Raspberry Pi after upgrading to Ubuntu 21.10. Install them with sudo apt install linux-modules-extra-raspi. You may need to restart MicroK8s afterwards.
Pod communication problems when using firewall-cmd (Fedora etc)
On systems which use firewall-cmd, pods are unable to communicate with each other because the firewall drops the packets. To check if this is the case, do the following:
# get the subnet cidr the pods are using
SUBNET=`cat /var/snap/microk8s/current/args/cni-network/cni.yaml | grep CALICO_IPV4POOL_CIDR -a1 | tail -n1 | grep -oP '[\d\./]+'`
echo $SUBNET
# enable logging of denied packets
sudo firewall-cmd --set-log-denied=all
sudo firewall-cmd --reload
# e.g. restart a pod and check for denied packets with dmesg
# (look for packets having an IP from the SUBNET above as SRC)
dmesg | grep -i REJECT
Solution: Create a dedicated zone for the microk8s subnet to avoid packets being dropped:
# if you see packets being rejected create a dedicated zone for microk8s:
sudo firewall-cmd --permanent --new-zone=microk8s-cluster
sudo firewall-cmd --permanent --zone=microk8s-cluster --set-target=ACCEPT
sudo firewall-cmd --permanent --zone=microk8s-cluster --add-source=$SUBNET
sudo firewall-cmd --reload
# finally reset the logging
sudo firewall-cmd --set-log-denied=off
sudo firewall-cmd --reload
I get "This node does not have enough RAM to host the Kubernetes control plane services"
MicroK8s will refuse to start on machines with less than 512MB available RAM, in order to prevent the system from running out of memory. It is suggested that these nodes are added as worker-only nodes to an existing cluster.
If you still wish to start the control plane services, you can do:
microk8s start --disable-low-memory-guard
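If you instead add the machine as a worker-only node to an existing cluster, the flow looks roughly like this; the address and token below are placeholders printed by add-node on an existing control-plane node:
# on an existing control-plane node
microk8s add-node
# on the low-memory machine, run the printed join command with --worker, e.g.:
microk8s join 10.0.0.11:25000/abcdef0123456789abcdef0123456789 --worker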
Reporting a bug
If you cannot solve your issue and believe the fault may lie in MicroK8s, please file an issue on the project repository.
To help us deal effectively with issues, it is incredibly useful to include the report obtained from microk8s inspect, as well as any additional logs and a summary of the issue.