MicroK8s failed to join RPI cluster error code 500

Hello there. I can’t for the life of me figure out why the designated leaf nodes won’t join the master node’s cluster. I’m following the Ubuntu tutorial on MicroK8s and the official MicroK8s documentation page here: MicroK8s - Clustering with MicroK8s

I can issue the add-node command on the master node fine, but the join command that I paste into one of the leaf nodes to bring them into the cluster fails with error code 500, and the message isn’t helpful. Could anyone point me in the right direction? I’m using Ubuntu Server for ARM.
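For context, the workflow is just the two documented steps (the token below is a placeholder for whatever add-node prints):

```
# On the master node (192.168.0.125 in my case): generate a join token
microk8s add-node

# On the leaf node: paste the exact command the master printed, e.g.
microk8s join 192.168.0.125:25000/<token>
```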

I am using carrier-grade NAT (a 4G router with a 4G SIM card inside); could this be the cause of the problem? I thought the nodes would at least join the cluster, since that shouldn’t require WAN connectivity and everything is on the same LAN (192.168.0.x). Thanks for your time.

The join command just returns:

```
contacting cluster at 192.168.0.125
failed to join cluster, error code 500
```

The error is the same on all the Pi 3Bs and the Pi 4; it just executes much faster on the Pi 4.


Hi,
There is a cluster-agent service running on the main node (in your example, 192.168.0.125).

Running journalctl -u snap.microk8s.daemon-cluster-agent on it might give some hints.
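Something along these lines should show whether the agent is up and listening (25000 is the default clustering port, assuming a stock install):

```
# On the main node: is the cluster agent running, and what has it logged?
sudo systemctl status snap.microk8s.daemon-cluster-agent
sudo journalctl -u snap.microk8s.daemon-cluster-agent -n 100 --no-pager

# Is anything listening on the clustering port the join command talks to?
sudo ss -tlnp | grep 25000
```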

Shows no logs. I followed this tutorial exactly:

I’ve tried enabling DMZ to the master node Pi in case it was a port-forwarding issue. Could carrier-grade NAT be blocking port 25000 on the LAN regardless of router settings? I may try plugging everything into the dial-up ADSL connection we’re about to cancel, just to see if that resolves things. The 4G router / carrier-grade NAT gives 3 megabytes per second as opposed to dial-up speeds, which is why I use it.

Hi, if you have a firewall in between, check the ports MicroK8s uses.

Hello, thanks for your reply. There is no firewall enabled on the master Pi and DMZ is enabled to it. I’m willing to try anything though, so I enabled ufw and allowed all of the ports from the MicroK8s services and ports page, on all the Pis.
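For the record, this is roughly what I ran on each Pi (from memory, so double-check the list against the services-and-ports page for your version):

```
# Allow the MicroK8s ports listed on the services-and-ports page
sudo ufw allow 16443/tcp   # API server
sudo ufw allow 10250/tcp   # kubelet
sudo ufw allow 10255/tcp   # kubelet read-only
sudo ufw allow 25000/tcp   # cluster-agent (what the join command contacts)
sudo ufw allow 12379/tcp   # etcd
sudo ufw allow 19001/tcp   # dqlite

# Let pod traffic through, as the MicroK8s docs suggest for ufw
sudo ufw allow in on cni0 && sudo ufw allow out on cni0
sudo ufw default allow routed
```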

Still no luck :frowning: Any other suggestions before I plug the Pis into an ADSL router to see if that resolves it? I’m currently running the microk8s.inspect command, but it just hangs on the “Inspecting cluster” heading forever.

Edit: same results on the ADSL connection, so it wasn’t carrier-grade NAT after all. I’m going to try removing the MicroK8s snap packages and installing a newer build. There are no problems in the network configuration: DMZ was enabled to the master on both the ADSL and the 4G router, and although no firewall was enabled I turned one on anyway and allowed all the MicroK8s-specific ports. I’m not sure what else I can try; I’ve hit a wall with my uni dissertation because of this error, and there doesn’t seem to be any indication of what’s causing it.
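The plan for the reinstall is roughly this (the channel is just an example):

```
# Clean out the existing snap and reinstall from a chosen channel
sudo snap remove microk8s --purge
sudo snap install microk8s --classic --channel=1.20/stable
sudo microk8s status --wait-ready
```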

Image of the same result on a different LAN, using the ADSL router instead of the 4G router, to rule out carrier-grade NAT as the issue:


I’ve tried all three addresses, same results.

I don’t understand why this issue is so uncommon. I’m starting to wonder if it’s Ubuntu Server, but it’s anyone’s guess at this point. Hopefully someone can point me in the right direction.

I tried to run the ‘microk8s.enable dns storage’ command and even that failed, so something is clearly wrong with the MicroK8s config on the master Pi. The microk8s status command also just hangs forever.

Doesn’t seem to be starting? :thinking:
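In case it helps anyone debugging the same hang, these are the kinds of checks I’m running (service names vary by release; older builds have separate daemon-apiserver/daemon-kubelet units, newer ones bundle them into daemon-kubelite):

```
# Does the node ever report ready, or does it block forever?
microk8s status --wait-ready

# Are the MicroK8s daemons actually active?
snap services microk8s

# Recent logs from the API server / kubelet services
sudo journalctl -u snap.microk8s.daemon-apiserver -n 50 --no-pager
sudo journalctl -u snap.microk8s.daemon-kubelet -n 50 --no-pager
```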

I am suffering from the same issue. My system automatically updated to 1.20.2 today, which killed my cluster. All worker nodes went into the “Not Ready” state and I did not get a chance to reactivate them. It is the second time something has gone wrong because of a snap update, and I haven’t really found a good guide on how to disable automatic updates altogether (maybe I’ll just set a refresh time I will never be alive to see).

While MicroK8s is lovely, these issues are really annoying, particularly since I had almost finished my move from a single Docker host to a 4-node cluster :frowning:

Thanks for helping to get this fixed and stable. I will now try to go back and install v1.19/stable.

Jens

I ended up just installing k3s and successfully got my Pi 3B workers up, although they’re not doing anything yet.
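For anyone curious, the k3s setup was basically the quick-start commands (the server IP and token below are placeholders):

```
# On the server node
curl -sfL https://get.k3s.io | sh -

# Grab the join token from the server
sudo cat /var/lib/rancher/k3s/server/node-token

# On each Pi 3B worker
curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -
```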

This is pretty frustrating; I’m not even getting MicroK8s 1.19 up and running anymore, despite running snap remove microk8s --purge first.

After disabling the HA cluster, the 1.19 version always complained that it could not connect nodes to a dqlite / HA cluster, even though HA was disabled on all the nodes and on the master…

I’ll give it one last try with snap install microk8s --classic, leaving HA enabled. If I again cannot add nodes, I will revert to kubespray (as that will not kill my cluster with auto-updates).

OK, this is working now: snap install microk8s --classic. Until now I had always added channel=1.20/stable (or 1.19, …), and I had also always deactivated the HA cluster, since I wanted to use all 4 nodes rather than setting one aside as a spare.
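To spell out the difference, this is only a sketch of what I did, but in case it matters:

```
# What I had been doing before (pinned channel, HA disabled afterwards)
sudo snap install microk8s --classic --channel=1.20/stable
sudo microk8s disable ha-cluster

# What worked now: default channel, HA left enabled
sudo snap install microk8s --classic
```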

I’ve decided to try the plain install now and also keep monitoring for the next snap update. If it all fails once more I will have to reconsider, since I want to move to production and need reliability.

If anyone has ideas about what is causing these effects, that would be very interesting.

It seems more people here are running k3s than MicroK8s, which is why I switched; it just worked, thankfully. (Coming from a noob.) Perhaps use that, or kubespray if you are already familiar with it. The Kubernetes Discord channel was helpful, if you want to ask for help identifying the cause of your issues in there.

@JensF @robtheslob sorry about your issues.
There’s a fix coming into 1.20 and 1.19 in the next few days concerning the memory leaks on dqlite.

Hi @JensF, you can set the date and time when snap updates will reach you, as described in Managing updates | Snapcraft documentation.
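As a sketch of the knobs described there (to my knowledge there is no permanent opt-out; a hold has to be renewed before it expires):

```
# Only let refreshes happen in a narrow window, e.g. Saturdays at 01:00
sudo snap set system refresh.timer=sat,01:00

# Or hold refreshes until a specific date (renew it periodically)
sudo snap set system refresh.hold="$(date --iso-8601=seconds -d '+60 days')"

# See when the next refresh is scheduled
snap refresh --time
```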

Yes, I am aware, thanks for the reminder. I just did not find a full opt-out from automated updates; maybe I’ll just pick a date in the far future…

That said, I do like being able to stay up to date, but I’m frustrated that I’ve had issues with two out of two releases, i.e. 100% of updates have failed for me :grinning:

I have now left the HA cluster enabled, and it looks like all 4 nodes are being scheduled, unlike when I first tried HA and one node was set aside as a spare.

We’ll see what happens with the 1.20.3 update. Is there any way to subscribe to release notifications?

Thx
Jens

Continuing the discussion from MicroK8s failed to join RPI cluster error code 500:

Yes, that’s pretty frustrating. Same problems here. It seems impossible with MicroK8s to create a cluster if the cluster members are on different physical hardware. I did get it working if I use only one server running multiple LXC containers to build the cluster, but as soon as multiple servers are involved it just doesn’t work. Even worse, joining a cluster from a remote node seems to work OK, with no complaints on the joining node, but then the master and all the other nodes are no longer accessible: kubectl runs into a timeout, and so does microk8s inspect.
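If it helps anyone comparing notes, when the cluster locks up like that I’ve been looking at which members dqlite thinks it has (paths taken from the MicroK8s HA docs, so treat this as a rough pointer and double-check for your version):

```
# On the original node: which members and addresses has dqlite recorded?
sudo cat /var/snap/microk8s/current/var/kubernetes/backend/cluster.yaml
sudo cat /var/snap/microk8s/current/var/kubernetes/backend/info.yaml
```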