If the lease held by pod-1 is still valid, I assume pod-2 should not be able to acquire it, so there should be some mechanism that prevents pod-2 from updating the Lease. But then I ran:
`kubectl patch Lease my-lease -p '{"spec":{"holderIdentity":"pod-2","renewTime":"2024-10-29T23:57:37.059086Z"}}'`
and it just responded with `lease.coordination.k8s.io/my-lease patched`, and the Lease was updated.
So it seems my assumption is wrong? How does the Lease lock actually work in k8s?
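For reference, this is the kind of setup I am talking about on the application side — a minimal sketch using client-go's leaderelection package; the lease name, namespace, identity, and durations are just example values I picked for illustration:

```go
package main

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// Build an in-cluster client (example setup only).
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// The Lease object that all candidate pods compete for.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "my-lease", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: "pod-1"},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		LeaseDuration:   15 * time.Second,
		RenewDeadline:   10 * time.Second,
		RetryPeriod:     2 * time.Second,
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) { log.Println("started leading") },
			OnStoppedLeading: func() { log.Println("stopped leading") },
		},
	})
}
```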
Thanks, I read the code, but where does the locking actually happen? If all of this logic happens in the pod itself, it would be unreliable: if the leader pod is down, pod-1 and pod-2 could simultaneously see that the leader is down and both try to claim the lease.
I've dug down into this leaselock.go file, but it does not seem to have any locking mechanism in it (it just calls ll.Client.Leases().Update(), which I assume is essentially the same as kubectl patch Lease?).
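To spell out what I think that call amounts to, here is my own paraphrase — not the actual client-go code; the tryUpdate name and the plain Get-then-Update structure are my assumptions about what an acquire/renew attempt boils down to:

```go
package leasesketch

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	coordclient "k8s.io/client-go/kubernetes/typed/coordination/v1"
)

// tryUpdate is my rough mental model of an acquire/renew attempt:
// read the Lease, overwrite its spec with my identity and a fresh
// renew time, and send it back with a plain Update call.
func tryUpdate(ctx context.Context, leases coordclient.LeaseInterface, name, identity string) error {
	lease, err := leases.Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}

	now := metav1.NewMicroTime(time.Now())
	lease.Spec.HolderIdentity = &identity
	lease.Spec.RenewTime = &now

	// From my reading, this looks just like what my kubectl patch did.
	_, err = leases.Update(ctx, lease, metav1.UpdateOptions{})
	return err
}
```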
Let’s say pod-1 is down.
pod-2 sees this, issues an update to the Lease, and gets a success.
In the meantime, pod-3 also sees this and issues an update to the Lease. I assume this should fail, or should not happen at all, because the code prevents it?
Where exactly in the code is this prevented? pod-2 and pod-3 run the same code at the same time, so following the code, they will get the same responses from the api-server:
- the lease has expired
- ll.Client.Leases().Update() returns no error
There should be a shared lock somewhere, but I can't find it.
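What I expected to find somewhere is that the second of those two updates fails and the caller has to notice it, something along these lines (again my own sketch, reusing the tryUpdate paraphrase from my previous post; whether client-go actually surfaces a conflict like this is exactly my question):

```go
package leasesketch

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	coordclient "k8s.io/client-go/kubernetes/typed/coordination/v1"
)

// tryAcquireExpired is what I expected the takeover path to look like:
// attempt the update, and if someone else got there first, detect that
// and back off instead of believing we are the leader.
func tryAcquireExpired(ctx context.Context, leases coordclient.LeaseInterface, name, identity string) (bool, error) {
	err := tryUpdate(ctx, leases, name, identity) // the sketch from my previous post
	if apierrors.IsConflict(err) {
		// Someone else updated the Lease at the same time: we lost the race.
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}
```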
I assume you are missing the point. pod-2 does not care whether pod-1 is down or not; pod-1 might get restarted and remain leader (if it comes back up quickly enough). pod-2 checks whether the lease record has been updated. If the lease record has not been updated for --leader-elect-lease-duration after the last update, then pod-2 tries to acquire the lock.
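Roughly, the check a candidate performs is just this — my own illustrative sketch of the expiry rule, not the actual client-go implementation:

```go
package sketch

import (
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
)

// leaseExpired sketches the rule described above: the candidate does not care
// whether the current holder's pod is up, only whether the holder has kept
// renewing the Lease within leaseDuration.
func leaseExpired(lease *coordinationv1.Lease, leaseDuration time.Duration, now time.Time) bool {
	if lease.Spec.RenewTime == nil {
		return true // never renewed: treat the lease as up for grabs
	}
	return now.After(lease.Spec.RenewTime.Add(leaseDuration))
}
```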
Yes, I understand this. But let's say pod-1 is really down and won't come back up.
pod-2 does its due diligence (checks the update time, etc.) and then decides to acquire the lease.
pod-3 is doing the same thing at the same time, right? So it will try to acquire the lease too.
At the start, the Lease holder is pod-1, but the lease has expired.
pod-2 acquires the lease (Lease holder: pod-2).
pod-3 acquires the lease about 1 ms later (Lease holder: pod-3).
pod-2 thinks it has the lease, so now it does the leader thing.
pod-3 thinks it has the lease too, so it also does the leader thing.
Now we end up with two pods each thinking it is the leader, which is exactly what should be prevented.
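To make the split brain I'm worried about concrete, here is a toy, purely in-memory simulation of the naive read-check-write I just described; nothing Kubernetes-specific, just two goroutines standing in for pod-2 and pod-3:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeLease is a toy in-memory stand-in for the Lease: a holder and a renew time.
type fakeLease struct {
	mu        sync.Mutex
	holder    string
	renewTime time.Time
}

func (l *fakeLease) read() (string, time.Time) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.holder, l.renewTime
}

func (l *fakeLease) write(holder string, t time.Time) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.holder, l.renewTime = holder, t
}

const leaseDuration = 15 * time.Second

// naiveAcquire is the read-check-write from my scenario: check that the old
// holder stopped renewing, then overwrite the lease and assume we won.
func naiveAcquire(l *fakeLease, me string, wg *sync.WaitGroup) {
	defer wg.Done()
	_, renewed := l.read()
	if time.Since(renewed) < leaseDuration {
		return // still held, back off
	}
	time.Sleep(time.Millisecond) // pod-2 and pod-3 act about 1 ms apart
	l.write(me, time.Now())
	fmt.Printf("%s: I acquired the lease, I am the leader now\n", me)
}

func main() {
	// pod-1 held the lease but stopped renewing a minute ago.
	lease := &fakeLease{holder: "pod-1", renewTime: time.Now().Add(-time.Minute)}

	var wg sync.WaitGroup
	wg.Add(2)
	go naiveAcquire(lease, "pod-2", &wg)
	go naiveAcquire(lease, "pod-3", &wg)
	wg.Wait()

	// Almost always, both goroutines print that they are the leader, while the
	// stored holder is whichever write happened to land last: a split brain.
	holder, _ := lease.read()
	fmt.Println("lease holder on record:", holder)
}
```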