microk8s 1.28 stable is installed on Ubuntu 22.04 LTS. Everything works fine.
After rebooting Ubuntu, microk8s needs at least three hours to get back to normal. During those three hours, microk8s status shows “microk8s is running”, but sometimes I get “The connection to the server 127.0.0.1:16443 was refused”.
Is a waiting time of several hours normal? How can I reduce it?
It’s not normal for MicroK8s to take several hours to get back to normal after a reboot. The issue might be related to system resources or services not starting correctly on boot. One common cause is a delay in the snap services that manage MicroK8s, as they sometimes take longer to initialize. You can try restarting MicroK8s manually after the reboot with sudo microk8s stop followed by sudo microk8s start, which may speed things up. Also, make sure your system has enough CPU and memory, as resource constraints can cause delays. If the problem persists, checking the logs (journalctl -u snap.microk8s.daemon-kubelet) for errors or timeouts might help identify the cause.
Thanks for the reply.
I ran snap services, sudo microk8s stop, and sudo microk8s start. Everything looks normal, but the problem is still there. One difference is that sometimes the waiting time is about 15 minutes, while other times it is over two hours.
When I run journalctl -u snap.microk8s.daemon-kubelet, I get no entries.
So I replaced kubelet with kubelite and ran journalctl -u snap.microk8s.daemon-kubelite, and I see:
..........................................
Oct 07 12:23:01 ultimate-force microk8s.daemon-kubelite[1838410]: F1007 12:23:01.010321 1838410 daemon.go:46] Proxy exited open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory
Oct 07 12:23:01 ultimate-force systemd[1]: snap.microk8s.daemon-kubelite.service: Main process exited, code=exited, status=255/EXCEPTION
Oct 07 12:23:01 ultimate-force systemd[1]: snap.microk8s.daemon-kubelite.service: Failed with result 'exit-code'.
Oct 07 12:23:01 ultimate-force systemd[1]: snap.microk8s.daemon-kubelite.service: Consumed 4.591s CPU time.
Oct 07 12:23:01 ultimate-force systemd[1]: snap.microk8s.daemon-kubelite.service: Scheduled restart job, restart counter is at 1605.
Oct 07 12:23:01 ultimate-force systemd[1]: Stopped Service for snap application microk8s.daemon-kubelite.
Oct 07 12:23:01 ultimate-force systemd[1]: snap.microk8s.daemon-kubelite.service: Consumed 4.591s CPU time.
Oct 07 12:23:01 ultimate-force systemd[1]: Started Service for snap application microk8s.daemon-kubelite.
............................
I can see that the kubelite service has restarted over 1000 times, and each restart takes about 4.5 seconds. So in total the waiting time is roughly 1000 × 4.5 s, which is over an hour.
My CPU is an Intel Core i7-7820X and I have 32GB of RAM.
Any ideas? I can provide more logs if required.
Thanks.
Hi,
You might be experiencing a disk I/O issue, especially if you have many pods running. Could you also check the output of microk8s kubectl top node? This is the first time I’ve seen such a problem.
Can you also provide the output of sudo sysctl net.netfilter.nf_conntrack_max? It seems that MicroK8s is unable to find the necessary kernel module for the proxy configuration.
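If the nf_conntrack module turns out not to be loaded yet when kube-proxy starts, one possible workaround (a sketch, assuming systemd-modules-load is in use, which is the default on Ubuntu 22.04) is to load it explicitly at boot:

```shell
# Load nf_conntrack now (assumption: the "no such file or directory"
# error means /proc/sys/net/netfilter/ is absent until the module loads)
sudo modprobe nf_conntrack

# Ask systemd-modules-load to load it on every boot
echo nf_conntrack | sudo tee /etc/modules-load.d/nf_conntrack.conf

# Verify the sysctl now exists
sudo sysctl net.netfilter.nf_conntrack_max
```

This only helps if the error comes from the module not being loaded early enough; your sysctl output will tell us whether that is the case.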
Hi Sakiphan, thanks for the help. My OS is Ubuntu 22.04 LTS with 512GB of RAM. Attached is the inspection report file.
Here is the screen output:
administer@gdsl1:~$ microk8s kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
gdsl1 1134m 1% 12492Mi 2%
administer@gdsl1:~$ sudo sysctl net.netfilter.nf_conntrack_max
[sudo] password for administer:
net.netfilter.nf_conntrack_max = 2359296
administer@gdsl1:~$ microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
WARNING: Maximum number of inotify user watches is less than the recommended value of 1048576.
Increase the limit with:
echo fs.inotify.max_user_watches=1048576 | sudo tee -a /etc/sysctl.conf
sudo sysctl --system
Building the report tarball
Report tarball is at /var/snap/microk8s/7231/inspection-report-20241009_093214.tar.gz
Sorry, I cannot attach the inspection report.
There are no other pods. Here is the output:
administer@gdsl1:~$ microk8s kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress nginx-ingress-microk8s-controller-24z7r 1/1 Running 5 (12d ago) 22d
kube-system calico-kube-controllers-67cddf978-7962s 1/1 Running 1 (12d ago) 14d
kube-system calico-node-ht59z 1/1 Running 1 (12d ago) 14d
kube-system coredns-864597b5fd-m27bn 1/1 Running 5 (12d ago) 22d
kube-system dashboard-metrics-scraper-5657497c4c-jq4j2 1/1 Running 5 (12d ago) 22d
kube-system hostpath-provisioner-7df77bc496-s7cl8 1/1 Running 6 (12d ago) 22d
kube-system kubernetes-dashboard-54b48fbf9-lsxtk 1/1 Running 5 (12d ago) 22d
kube-system metrics-server-848968bdcd-6vhx4 1/1 Running 5 (12d ago) 22d
testjhub continuous-image-puller-rs9k5 1/1 Running 4 (12d ago) 22d
testjhub hub-6864cf6fd6-9x2d8 1/1 Running 8 (11d ago) 22d
testjhub proxy-5767957cf6-vmdlw 1/1 Running 4 (12d ago) 22d
testjhub user-scheduler-78fd69dc9-5rgmc 1/1 Running 4 (12d ago) 22d
testjhub user-scheduler-78fd69dc9-kmgsd 1/1 Running 5 (12d ago) 22d
sudo snap refresh
sudo rm -rf /var/lib/snapd/cache/*
Yesterday I got an error like yours on my test server. After running the commands above, I upgraded and reinstalled MicroK8s, and it was fixed. If your YAML files are up to date you can follow the same path, but try the commands above before reinstalling. Let me know (and don’t forget to check your internet connection).
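If you do end up reinstalling, the usual snap sequence would be something like this (a sketch; note that removing the snap deletes the cluster state, so back up anything you need first):

```shell
# Remove the existing installation (this deletes the cluster state)
sudo snap remove microk8s

# Reinstall from the same channel you were on
sudo snap install microk8s --classic --channel=1.28/stable

# Block until all services report ready
sudo microk8s status --wait-ready
```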
I did:
sudo snap refresh
sudo rm -rf /var/lib/snapd/cache/*
After that, my MicroK8s version is v1.28.14, revision 7231.
Then I rebooted for the first time. After 24 minutes, microk8s was working again.
After the second reboot, it took 4 hours for microk8s to work again.
After the third reboot, it took 48 minutes.
Do I need to reinstall microk8s to solve the problem?
After:
sudo snap refresh
sudo rm -rf /var/lib/snapd/cache/*
I reinstalled microk8s 1.28/stable. After the first computer reboot, it took 50 minutes for microk8s to work again. After the second reboot, it took 66 minutes.
Then I reinstalled the computer with Ubuntu 22.04 LTS and microk8s 1.28/stable. After rebooting, I had to wait 73 minutes until microk8s was working again.
It looks like the problem is still there, no matter whether I reinstall microk8s or Ubuntu.
I know microk8s cannot start without an internet connection. What are the internet requirements for starting Ubuntu with microk8s? For example, which external hosts and ports does microk8s need to access?
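One thing I could check is basic reachability of the Snap Store, since snapd talks to api.snapcraft.io over HTTPS. A rough connectivity check might look like this (a sketch — this host is what snapd itself needs, not an official list of everything microk8s contacts):

```shell
# Check reachability of the Snap Store API on port 443
host=api.snapcraft.io
if nc -zw5 "$host" 443; then
  echo "$host:443 reachable"
else
  echo "$host:443 NOT reachable"
fi
```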