Hi folks,
I work for EKS and I'm thinking about how to make the cluster upgrade experience smoother.
EKS provides a simple platform for creating clusters in AWS. Outside EKS, this question is relevant for any platform team that provides tools for operators to create clusters and run applications.
In the simplest model, the platform team (here, EKS) does not enforce any special restrictions on admission webhooks. Users can create and deploy webhooks that could start breaking requests at any time.
When platform operators upgrade the cluster to a newer Kubernetes version, a failing webhook can block the process. This can happen because the API server creates new resources that are introduced by the new version, and a webhook that matches those resources can reject or time out on the creation requests.
How do platform teams deal with this situation? I feel platform operators could add a constraint that disallows webhooks matching all (*) resources. Are there other mechanisms operators can use to prevent breaking webhooks from causing cluster-level disruption?
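To make the constraint idea concrete, here is a rough sketch of the kind of check I have in mind. It assumes the webhook configuration has already been loaded into a plain dict shaped like a ValidatingWebhookConfiguration or MutatingWebhookConfiguration object; the function name `risky_webhooks` and the sample data are made up for illustration, and this is not something EKS does today.

```python
from typing import Any

# Resource patterns that match every resource ("*") or every
# resource and subresource ("*/*") in a webhook rule.
WILDCARDS = {"*", "*/*"}


def risky_webhooks(config: dict[str, Any]) -> list[str]:
    """Return names of webhooks that match all resources and fail closed."""
    risky = []
    for hook in config.get("webhooks", []):
        # In admissionregistration.k8s.io/v1 an unset failurePolicy means Fail.
        fail_closed = hook.get("failurePolicy", "Fail") == "Fail"
        matches_everything = any(
            WILDCARDS & set(rule.get("resources", []))
            for rule in hook.get("rules", [])
        )
        if fail_closed and matches_everything:
            risky.append(hook.get("name", "<unnamed>"))
    return risky


# Hypothetical configuration used only to exercise the check.
sample = {
    "webhooks": [
        {
            "name": "all-resources.example.com",
            "failurePolicy": "Fail",
            "rules": [{"apiGroups": ["*"], "resources": ["*"]}],
        },
        {
            "name": "pods-only.example.com",
            "failurePolicy": "Fail",
            "rules": [{"apiGroups": [""], "resources": ["pods"]}],
        },
        {
            "name": "best-effort.example.com",
            "failurePolicy": "Ignore",
            "rules": [{"apiGroups": ["*"], "resources": ["*"]}],
        },
    ]
}

print(risky_webhooks(sample))
```

A check like this could run before an upgrade to flag the webhooks most likely to block it. Whether to reject them outright or only warn is a separate policy decision.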
Cluster information:
Kubernetes version: Any
Cloud being used: EKS
Installation method:
Host OS: AL2
CNI and version: ANY
CRI and version: ANY