Hello there, fellow Kubernetes administrator here. MicroK8s caught my interest and I’m currently experimenting with it. During my experiments I noticed a strange behavior: as soon as the cluster is created, about 4 GiB of RAM are in use, and as time goes on the cluster’s memory usage keeps growing by roughly 0.25 GiB per hour, sometimes more. I’d like to isolate the problem but have been unable to do so.
Here’s some information about the setup used:
3× Raspberry Pi 4 (4 GB)
Ubuntu 20.04.1 (fresh install)
MicroK8s 1.19/stable
HA enabled
I first noticed this behavior through Lens during my tests, but the strange increments in memory utilization are present even in a freshly installed cluster. I verified the growth both through Lens (Prometheus) and through kubectl top (metrics-server). When I installed Rook/Ceph on the cluster the problem became more apparent: the monitors slowly started consuming more and more memory, even though the OSDs are stopped and no pod is consuming the resources offered by Ceph.
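For reference, the checks were along these lines (metrics-server can be enabled as a MicroK8s addon; the sort flag needs a reasonably recent kubectl):

microk8s enable metrics-server
microk8s kubectl top nodes
microk8s kubectl top pods -A --sort-by=memory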
At some point the cluster stops responding (no connection to the apiserver). The systemd services are still running on each node and the resources remain committed, but no Kubernetes service or internal component responds.
Logs are then flooded with:
apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}
E1125 leaderelection.go:321] error retrieving resource lock kube-system/kube-controller-manager: Get "https://127.0.0.1:16443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s": context deadline exceeded
and similar messages.
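These come from the journals of the MicroK8s snap services; on 1.19 they can be followed with something like the following (assuming the standard unit names):

sudo journalctl -u snap.microk8s.daemon-apiserver -f
sudo journalctl -u snap.microk8s.daemon-controller-manager -f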
Rebooting all the hosts restores the cluster to its previous state, but the problem persists. Has somebody encountered this issue? Is it limited to RPi 4 + MicroK8s? Thank you.
@marksei, you are not alone, and it’s not RPi-specific either. See this GitHub issue for the current status. There was an issue with database corruption which has been addressed, and we’re now trying to see what’s going on with this memory leak. You should be able to restart just the apiserver service, which recovers the memory without affecting the pods on the cluster. At least one user is reporting that capping the service’s memory via systemd, so it gets restarted when it hits the limit, works without noticeable effect.
That’s clearly not a long-term solution and we’re actively working to track it down. To stay up to date I recommend following the GitHub issue above.
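For the record, a rough sketch of what that could look like on 1.19 (the unit name below is the 1.19 one; the memory limit is only an example and would need tuning for a 4 GB Pi):

# one-off: restart just the apiserver to reclaim the leaked memory
sudo systemctl restart snap.microk8s.daemon-apiserver

# or let systemd handle it: add a drop-in so the service is killed and
# restarted once it crosses a memory ceiling
sudo systemctl edit snap.microk8s.daemon-apiserver
#   [Service]
#   MemoryMax=1G
#   Restart=on-failure
#   RestartSec=10s

With MemoryMax set, the processes in the service’s cgroup get OOM-killed when they cross the limit and systemd restarts the unit, which is essentially the behaviour that user described.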
Thank you, I’ll keep an eye on the issue. I had imagined this workaround was possible, but it is way too hacky for my taste. So the problem lies within Dqlite, am I correct? To add something to the issue: I can’t even use microk8s status, the command just hangs (any kubectl command isn’t an option either).
I’ll avoid guessing where exactly the memory leak comes from. If I were confident in that answer, I’d submit a patch to fix it.
Not being able to use microk8s status is not the same issue as what I’m describing above. You might be better served checking for a relevant bug, or filing one with some logs, so someone can look at your specific issue with the status command.
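If you do file one, microk8s inspect gathers the service logs and cluster state into a tarball you can attach to the report:

sudo microk8s inspect
# prints the path of the generated inspection-report tarball when it finishes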
Thank you, Chris. I was hoping to get a few more insights so I could hunt the bug down myself.
I do believe the two issues are correlated, or probably the same. From what I’ve read, this only happens in HA configurations. From my observations, no process gets OOM-killed in my setup; the apiserver is still running, yet the cluster is unresponsive, and all the nodes fail almost at the same time (probably because they are identical).

Another thing I’ve noticed is that the usage reported by Kubernetes is consistent with what the OS reports, and over time some pods (especially the Rook and Calico ones) grow in memory usage, slowly but steadily. If the memory leak were in the apiserver or in dqlite itself, it would be fair to assume one of those two processes (or some other process) would eventually get killed, but that doesn’t happen to me. What I’m observing is close to this behavior. I’ll continue searching for the cause.
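In the meantime, something along these lines should show which process is actually growing on each node (file path and interval are arbitrary):

# append a timestamped snapshot of the biggest memory consumers every 5 minutes
while true; do
  { date; ps -eo rss,comm --sort=-rss | head -15; echo; } >> /tmp/mk8s-rss.log
  sleep 300
done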