I created a 3-node cluster without issue and was able to run a few “hello world” type pods. After a couple of days the cluster was no longer running and I could not get it to start. “microk8s start” took a long time to complete, and after it did, “microk8s status” would report “microk8s is not running. Use microk8s inspect for a deeper inspection.”
“microk8s inspect” would intermittently report either that everything was running or that the api-server had failed. In either case the api-server was always unreachable.
The journal for the api-server shows this continuous loop, which repeats about once every second:
snap.microk8s.daemon-apiserver.service: Main process exited, code=killed, status=11/SEGV
snap.microk8s.daemon-apiserver.service: Failed with result 'signal'.
snap.microk8s.daemon-apiserver.service: Scheduled restart job, restart counter is at 4.
Stopped Service for snap application microk8s.daemon-apiserver.
Started Service for snap application microk8s.daemon-apiserver.
…
snap.microk8s.daemon-apiserver.service: Main process exited, code=killed, status=11/SEGV
snap.microk8s.daemon-apiserver.service: Failed with result 'signal'.
snap.microk8s.daemon-apiserver.service: Scheduled restart job, restart counter is at 5.
Stopped Service for snap application microk8s.daemon-apiserver.
Today I have the same issue on a two-node cluster: the apiserver dies with status=11/SEGV. With only two nodes HA is not enabled, so I cannot get valid backend files from another node. How can I recover in this case?
backend folder:
/var/snap/microk8s/current/var/kubernetes/backend# ls -la
total 355788
drwxrwx--- 2 root microk8s 4096 Oct 22 08:28 ./
drwxr-xr-x 3 root root 4096 Oct 20 12:16 ../
-rw-rw---- 1 root microk8s 4099760 Oct 22 06:43 0000000000949134-0000000000949304
-rw-rw---- 1 root microk8s 8380624 Oct 22 06:44 0000000000949305-0000000000949879
-rw-rw---- 1 root microk8s 8381480 Oct 22 06:44 0000000000949880-0000000000950465
-rw-rw---- 1 root microk8s 8387024 Oct 22 06:45 0000000000950466-0000000000951071
-rw-rw---- 1 root microk8s 7933552 Oct 22 06:45 0000000000951072-0000000000951650
-rw-rw---- 1 root microk8s 8378272 Oct 22 06:48 0000000000951651-0000000000952024
-rw-rw---- 1 root microk8s 8378600 Oct 22 06:49 0000000000952025-0000000000952570
-rw-rw---- 1 root microk8s 8378528 Oct 22 06:49 0000000000952571-0000000000953172
-rw-rw---- 1 root microk8s 8387816 Oct 22 06:49 0000000000953173-0000000000953789
-rw-rw---- 1 root microk8s 8387096 Oct 22 06:50 0000000000953790-0000000000954396
-rw-rw---- 1 root microk8s 8362400 Oct 22 06:50 0000000000954397-0000000000955002
-rw-rw---- 1 root microk8s 3477568 Oct 22 06:50 0000000000955003-0000000000955252
-rw-rw---- 1 root microk8s 8361432 Oct 22 06:53 0000000000955253-0000000000955617
-rw-rw---- 1 root microk8s 8369672 Oct 22 06:53 0000000000955618-0000000000956153
-rw-rw---- 1 root microk8s 3927640 Oct 22 06:54 0000000000956154-0000000000956441
-rw-rw---- 1 root microk8s 8386632 Oct 22 06:56 0000000000956442-0000000000956814
-rw-rw---- 1 root microk8s 8385008 Oct 22 06:58 0000000000956815-0000000000957221
-rw-rw---- 1 root microk8s 8378672 Oct 22 06:59 0000000000957222-0000000000957825
-rw-rw---- 1 root microk8s 3780080 Oct 22 06:59 0000000000957826-0000000000958116
-rw-rw---- 1 root microk8s 8295576 Oct 22 07:01 0000000000958117-0000000000958480
-rw-rw---- 1 root microk8s 48 Oct 22 07:01 0000000000958481-0000000000958481
-rw-rw---- 1 root microk8s 48 Oct 22 07:02 0000000000958482-0000000000958482
-rw-rw---- 1 root microk8s 808 Oct 22 07:49 0000000000958483-0000000000958502
-rw-rw---- 1 root microk8s 48 Oct 22 07:49 0000000000958503-0000000000958503
-rw-rw---- 1 root microk8s 48 Oct 22 07:49 0000000000958504-0000000000958504
-rw-rw---- 1 root microk8s 48 Oct 22 08:28 0000000000958505-0000000000958505
-rw-rw---- 1 root microk8s 2216 Oct 20 12:16 cluster.crt
-rw-rw---- 1 root microk8s 3272 Oct 20 12:16 cluster.key
-rw-rw---- 1 root microk8s 67 Oct 22 08:28 cluster.yaml
-rw-rw---- 1 root microk8s 61 Oct 20 12:24 info.yaml
srw-rw---- 1 root microk8s 0 Oct 22 06:59 kine.sock=
-rw-rw---- 1 root microk8s 32 Oct 20 12:16 metadata1
-rw-rw---- 1 root microk8s 75700904 Oct 22 06:54 snapshot-1-956427-151041951
-rw-rw---- 1 root microk8s 72 Oct 22 06:54 snapshot-1-956427-151041951.meta
-rw-rw---- 1 root microk8s 75251824 Oct 22 06:58 snapshot-1-957451-151331063
-rw-rw---- 1 root microk8s 72 Oct 22 06:58 snapshot-1-957451-151331063.meta
-rw-rw---- 1 root microk8s 64442368 Oct 22 07:01 snapshot-1-958476-151508802
-rw-rw---- 1 root microk8s 72 Oct 22 07:01 snapshot-1-958476-151508802.meta
The 48-byte segment files near the end of the listing (for example 0000000000958481-0000000000958481) seem awfully small. You could try moving them out of the way, in case they are corrupted, and see whether the apiserver is then able to start.
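As a rough sketch of that suggestion, the helper below moves (never deletes) very small segment files into a separate directory so they can be put back if it does not help. The function name move_small_segments, the 64-byte default threshold, and the backup path in the usage comments are illustrative assumptions, not something taken from this thread or from MicroK8s itself. It also assumes the files in question are the 48-byte ones named as two 16-digit numbers.

```shell
#!/usr/bin/env bash
# Sketch only: move suspiciously small segment files out of the dqlite
# backend directory. Assumptions: helper name, 64-byte threshold, and
# backup location are hypothetical choices for illustration.

move_small_segments() {
    local backend_dir="$1"       # directory holding the segment files
    local backup_dir="$2"        # destination; files are moved, not deleted
    local max_bytes="${3:-64}"   # files strictly smaller than this are moved

    mkdir -p "$backup_dir" || return 1

    local f name size
    for f in "$backend_dir"/*-*; do
        [ -f "$f" ] || continue
        name="$(basename "$f")"
        # Only touch files named <16 digits>-<16 digits>; snapshots,
        # certificates, YAML files and the socket are left alone.
        if ! printf '%s\n' "$name" | grep -Eq '^[0-9]{16}-[0-9]{16}$'; then
            continue
        fi
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -lt "$max_bytes" ]; then
            echo "moving $name ($size bytes)"
            mv -- "$f" "$backup_dir"/
        fi
    done
}

# Possible usage from a root shell, with the cluster stopped first:
#   microk8s stop
#   move_small_segments /var/snap/microk8s/current/var/kubernetes/backend /root/backend-moved
#   microk8s start
```

Taking a full copy of the backend directory before trying this is a sensible precaution, since whether the apiserver starts afterwards is not guaranteed.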