Leader elections failing, lease unable to be renewed automatically

janeosaka · January 17, 2023, 3:25pm

Hello,

I have a production cluster currently running K8s version 1.19.9, where the kube-scheduler and kube-controller-manager are failing leader election. A candidate is able to acquire the first lease, but it then cannot renew or reacquire it. As a result, the instances are stuck in a constant loop of electing leaders: none of them holds the lease long enough to do anything meaningful before it times out, at which point another instance takes a new lease. This repeats from node to node.

My duct-tape recovery method was to shut down the other candidates and disable leader election with --leader-elect=false. We manually set a leader and let it run for a while, then re-enabled leader election afterwards. This seems to have worked: the leases are now renewing normally.

Could it be that the api-server was too overwhelmed to respond in time, so that the lease renewals failed due to timeouts? I was wondering if anyone has encountered such an issue.
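
To make the suspicion concrete, here is a toy model of a single renewal window. It is only a sketch of my mental model, not the actual client-go logic. It assumes the usual defaults of a 10s renew deadline (--leader-elect-renew-deadline) and a 2s retry period (--leader-elect-retry-period); the 5s per-request timeout and the names `ElectionTimings` and `renews_in_time` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ElectionTimings:
    renew_deadline: float = 10.0   # --leader-elect-renew-deadline (default, as far as I know)
    retry_period: float = 2.0      # --leader-elect-retry-period (default, as far as I know)
    request_timeout: float = 5.0   # hypothetical per-request timeout, an assumption of this model


def renews_in_time(latencies: Sequence[float],
                   t: ElectionTimings = ElectionTimings()) -> bool:
    """Return True if one renewal attempt succeeds before the renew deadline.

    `latencies` holds the api-server response time (in seconds) for each
    successive renewal attempt. An attempt slower than `request_timeout`
    counts as failed, and the leader waits `retry_period` before retrying.
    """
    elapsed = 0.0
    for latency in latencies:
        if latency <= t.request_timeout:
            # The request came back; it only helps if the deadline has not passed.
            return elapsed + latency <= t.renew_deadline
        # The request timed out: pay the timeout plus the retry wait.
        elapsed += t.request_timeout + t.retry_period
        if elapsed >= t.renew_deadline:
            return False  # deadline exhausted, the leader gives up the lease
    return False  # ran out of attempts without a successful renewal


if __name__ == "__main__":
    print(renews_in_time([0.1]))          # healthy api-server
    print(renews_in_time([6.0, 0.5]))     # one slow response, then recovery
    print(renews_in_time([6.0, 6.0]))     # consistently slow api-server
```

In this model a consistently slow api-server uses up the whole renew deadline after only two timed-out attempts, which would match a leader that acquires the lease once and then loses it. If that is what happened here, raising the renew deadline and lease duration might have been a less drastic workaround than disabling leader election, but I have not tested that.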
