If the lease held by pod-1 is still valid, I assume pod-2 should not be able to acquire it, so there should be some mechanism preventing pod-2 from updating the Lease. But then I run: kubectl patch Lease my-lease -p '{"spec":{"holderIdentity":"pod-2","renewTime":"2024-10-29T23:57:37.059086Z"}}'
it just responded with lease.coordination.k8s.io/my-lease patched
and the Lease has been updated.
So it seems my assumptions are wrong? How does the Lease Lock work in k8s?
Thanks, I read the code, but where does the locking actually happen? If all of this logic runs inside the pod itself it would be unreliable, since if the leader pod is down, pod-1 and pod-2 could simultaneously see that the leader is down and both try to claim it.
I’ve dug into the leaselock.go file, but it does not seem to contain any locking mechanism (it just calls ll.Client.Leases().Update(), which I assume is equivalent to kubectl patch Lease?).
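To make this concrete, the code path I'm looking at boils down to roughly the following (my own simplified sketch, not the actual client-go source; the helper name updateLease is mine):

```go
package example

import (
	"context"
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	coordclient "k8s.io/client-go/kubernetes/typed/coordination/v1"
)

// updateLease is a simplified stand-in for what leaselock.go's Update appears
// to do: write the new holder and renew time into a Lease object that was
// read earlier via Get, then send it back with Update. I don't see any
// explicit mutex or lock taken anywhere around this call.
func updateLease(ctx context.Context, leases coordclient.LeaseInterface,
	lease *coordinationv1.Lease, holder string) (*coordinationv1.Lease, error) {
	now := metav1.NewMicroTime(time.Now())
	lease.Spec.HolderIdentity = &holder
	lease.Spec.RenewTime = &now
	return leases.Update(ctx, lease, metav1.UpdateOptions{})
}
```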
Let’s say pod-1 is down.
pod-2 sees this, so it issues an update to the Lease and gets a success.
In the meantime, pod-3 also sees this and issues an update to the Lease. I assume this should fail, or should not happen at all, because the code prevents it?
Where exactly in the code is this prevented? pod-2 and pod-3 run the same code at the same time, so following the code, they will get the same responses from the api-server:
the lease has expired
ll.Client.Leases().Update() returns no error
There should be a shared lock somewhere, but I can’t find it.
I think you are missing the point. pod-2 does not care whether pod-1 is down or not; pod-1 might get restarted and remain the leader (if it comes back up quickly enough). pod-2 only checks whether the lease record has been updated. If the lease record has not been renewed within --leader-elect-lease-duration of the last update, then pod-2 will try to acquire the lock.
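For concreteness, the staleness check a candidate applies boils down to something like this (a simplified sketch of the idea, not the actual client-go code; leaseDuration here stands in for --leader-elect-lease-duration):

```go
package main

import (
	"fmt"
	"time"
)

// leaseStillHeld reports whether the current holder's claim should still be
// respected: the last renewTime plus the lease duration must lie in the future.
// Only when this returns false does a candidate try to take over the lock.
func leaseStillHeld(renewTime time.Time, leaseDuration time.Duration, now time.Time) bool {
	return renewTime.Add(leaseDuration).After(now)
}

func main() {
	leaseDuration := 15 * time.Second              // e.g. --leader-elect-lease-duration=15s
	lastRenew := time.Now().Add(-30 * time.Second) // pod-1 stopped renewing 30s ago

	if leaseStillHeld(lastRenew, leaseDuration, time.Now()) {
		fmt.Println("pod-1 is still considered the leader, back off")
	} else {
		fmt.Println("lease looks stale, try to acquire it")
	}
}
```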
Yes, I understand this. But let’s say pod-1 is really down and won’t come back up.
pod-2 does its due diligence (checks the renew time, etc.) and decides to acquire the lease.
pod-3 is doing the same thing at the same time, right? So it will try to acquire the lease too.
At the start, the Lease holder is pod-1, but the lease has expired
pod-2 acquires the lease (Lease holder: pod-2)
pod-3 acquires the lease, say 1 ms later (Lease holder: pod-3)
pod-2 thinks it has the lease, so now it does the leader work
pod-3 thinks it has the lease too, so now it also does the leader work
Now we end up with 2 pods each thinking it is the leader, which should be prevented.
Actually, at the point where pod-2 and pod-3 both try to acquire the lease, here is what will happen:
a. pod-2 and pod-3 both get the Lease with resourceVersion X
b. pod-2 and pod-3 both try to update the Lease, each setting its own holderIdentity and sending back resourceVersion X
But only one of the updates will go through:
If pod-2 updates first, the resourceVersion moves from X to Y and holderIdentity becomes pod-2, which means pod-3’s update fails with a conflict
If pod-3 updates first, the resourceVersion moves from X to Y and holderIdentity becomes pod-3, which means pod-2’s update fails with a conflict
So, this still works out as designed.
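If you want to see this for yourself, a sketch like the one below reproduces the race against a live cluster. It assumes a Lease named my-lease already exists in the default namespace and that your kubeconfig points at the cluster; the variable names are made up for the example:

```go
package main

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Uses your local kubeconfig; assumes the Lease "my-lease" already exists
	// in the "default" namespace.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	leases := kubernetes.NewForConfigOrDie(config).CoordinationV1().Leases("default")

	// a. pod-2 and pod-3 both observe the Lease at resourceVersion X.
	seenByPod2, err := leases.Get(context.TODO(), "my-lease", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	seenByPod3 := seenByPod2.DeepCopy()

	// b. pod-2 writes first: the server accepts it and bumps resourceVersion X -> Y.
	pod2 := "pod-2"
	seenByPod2.Spec.HolderIdentity = &pod2
	if _, err := leases.Update(context.TODO(), seenByPod2, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}

	// pod-3's copy still carries resourceVersion X, so its update is rejected.
	pod3 := "pod-3"
	seenByPod3.Spec.HolderIdentity = &pod3
	_, err = leases.Update(context.TODO(), seenByPod3, metav1.UpdateOptions{})
	fmt.Println("pod-3 got 409 Conflict:", apierrors.IsConflict(err)) // prints: true
}
```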
But @penguin, I completely agree with you that the Lease update behaviour is not intuitive.
I spent about 2 days trying to understand why the updates go through as you describe. I would have expected k8s to reject them (unless --force was used); maybe that is an enhancement for the future.
If that’s the case, how do I simulate this? As I mentioned in the very first post, I should be able to simulate it using kubectl.
I want to make sure that at any given time there is only one leader (or none). If two database instances simultaneously think they are the leader and start accepting requests, that will cause problems.
Your assumption is partially correct, but Kubernetes does not enforce Lease locking at the API level. The leader election mechanism relies on controllers (or applications) respecting the Lease semantics rather than on Kubernetes blocking updates: any client with the right RBAC permissions can overwrite the Lease with kubectl, exactly as you observed. In a real leader election scenario, clients first check whether the Lease is still valid before attempting to acquire it, and competing writes are serialized by the resourceVersion conflict check described above. So the locking discipline lives in the election logic running in the clients, not in the API server refusing arbitrary writes. To simulate the proper behaviour, you would need to run a controller (or any program) that follows the leader election logic rather than patching the Lease by hand.
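For example, a minimal candidate built on client-go's leader election helpers might look roughly like the sketch below. The lease name my-lease, the default namespace, and the timing values are assumptions you would adapt; run the same binary in two terminals (or two pods) and only one of them will hold the lease at a time:

```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// For a quick local test this loads your kubeconfig; inside a pod you
	// would use rest.InClusterConfig() instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	id, _ := os.Hostname() // each candidate needs its own identity (e.g. the pod name)

	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "my-lease", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		LeaseDuration:   15 * time.Second, // how long a lease is honoured without renewal
		RenewDeadline:   10 * time.Second, // the leader must renew within this window or step down
		RetryPeriod:     2 * time.Second,  // how often candidates retry acquisition
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				log.Printf("%s: acquired the lease, doing leader work", id)
				<-ctx.Done() // do the actual leader work here
			},
			OnStoppedLeading: func() {
				log.Printf("%s: lost the lease, stop accepting writes", id)
			},
		},
	})
}
```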