Is there prior art/discussions on optionally ignoring readiness probes when some fraction of replicas are not ready?

It seems like a common pattern in service development would be for a readiness probe to check connectivity, authentication, etc., against a dependent service. This is useful for automatically limiting the impact of a stale credential in some pods, a network partition affecting only some nodes, and similar partial failures.
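As a concrete sketch of that pattern, here is a minimal connectivity check a readiness handler might call; the dependency's host and port are placeholders, and a real probe would likely also authenticate or issue a cheap request:

```python
import socket


def dependency_ready(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the dependent service can be opened.

    This only demonstrates the connectivity part of such a probe; checking
    credentials or issuing a lightweight request would build on this.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

A `/readyz` HTTP handler would return 200 or 503 based on this result, which is exactly what makes a shared dependency outage take every replica out of the pool at once.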

However, doing this is of course also prone to leaving all replicas reporting not ready, and therefore removed from the load balancer, during an outage of the dependent service they are testing. That situation guarantees zero availability.

Some scenarios exist where a replica might want to request removal from the load balancer, but only if it is an outlier among its peers in the deployment. For example, suppose the dependent service is required to serve some but not all requests to our service. Then it might be desirable to remove ourselves from handling additional requests only when this replica is uniquely experiencing the issue. If the dependent service outage is instead affecting most or all of our peer replicas, we would rather remain in the pool so that we can still serve the requests that don't need the failed dependency.

One solution to this could be to put the onus on each application to watch peer pods come and go, and to check each other's readiness probes (or use fancier schemes) to identify whether failures observed on this replica are outlier events. But I am not finding libraries to help with this.
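The decision step of such an application-side scheme could look like the sketch below. The peer-discovery and peer-probing machinery (watching pods, polling their readiness endpoints) is elided; the function name and the 50% quorum default are assumptions for illustration:

```python
def failure_is_outlier(my_probe_failed: bool, peer_failures: int,
                       peer_count: int, quorum_fraction: float = 0.5) -> bool:
    """Decide whether this replica's probe failure looks replica-local.

    Returns True (i.e. "remove me from the pool") only when this replica is
    failing and fewer than quorum_fraction of its peers are failing too; a
    widespread failure suggests a shared dependency outage, in which case
    staying in the pool preserves availability for requests that don't need
    the dependency.
    """
    if not my_probe_failed:
        return False
    if peer_count == 0:
        return True  # no peers to compare against; report our own failure
    return (peer_failures / peer_count) < quorum_fraction
```

So `failure_is_outlier(True, 0, 9)` says to leave the pool (only we are failing), while `failure_is_outlier(True, 8, 9)` says to stay (almost everyone is failing).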

Theoretically, wouldn't a simpler solution be to add a configuration option to readiness probes specifying that pods failing their checks should be removed from the load balancer only until N percent of the running pods in the ReplicaSet are failing that probe, at which point the probe is disregarded?
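To make the proposal concrete: no such option exists today, but the selection rule being suggested amounts to something like this (function name and the threshold semantics are hypothetical):

```python
def endpoints_after_threshold(probe_results: list[bool],
                              disregard_at: float = 0.5) -> list[bool]:
    """Hypothetical endpoint-selection rule for the proposed option.

    Honor the readiness probe normally, but once at least `disregard_at`
    of the running pods are failing it, disregard the probe entirely and
    keep every pod in the pool rather than guarantee zero availability.
    """
    if not probe_results:
        return []
    failing = sum(1 for ok in probe_results if not ok)
    if failing / len(probe_results) >= disregard_at:
        return [True] * len(probe_results)
    return list(probe_results)
```

With a 50% threshold, one failing pod out of three is removed as usual, but two failing pods out of three means the probe is ignored and all three stay in the pool.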

Is this a sensible idea, or is there some obvious justification for not doing this that explains why it does not exist?

It’s a little more complicated. It’s totally valid for a deployment to have multiple replica sets active for an extended period of time. It’s totally valid for a service to span multiple deployments for an indefinite period of time.

That doesn’t make this impossible, just not as easy as it sounds. A better approach might be in the endpoint slice logic, which is the result of all of the above, and already has fields indicating readiness (“serving”).

It sounds like you think the general concept of having Kubernetes decide when to disregard a readiness probe, based on what's currently happening on related pods, isn't nuts, but that the interesting question is: what should the population of related pods consist of?