Make kubelet changes persistent on GKE


#1

Does anyone know how to make kubelet changes persistent on GKE? I was able to make changes by adding arguments to /etc/default/kubelet (which sets KUBELET_OPTS) or to /home/kubernetes/kubelet-config.yaml, which is passed to the kubelet with --config. But after a while the changes are rolled back by GKE. Any help here will be greatly appreciated.

I found these issues related to this, but no good answer so far:
https://issuetracker.google.com/issues/118639505
https://issuetracker.google.com/issues/118428580
https://issuetracker.google.com/issues/111408108


#2

There isn’t a general mechanism for making arbitrary permanent changes. That would make long-term support very difficult as configs evolve.

Can you say more about what specifically you are trying to do? That helps us (wearing my GKE hat) to understand the needs of the GKE product and what we need to better support.


#3

Thanks for replying @thockin

Specifically, I’m trying to set cpu-manager-policy to static. This is for running a database. Right now the context switches are very bad for performance, and CPU pinning brings a lot of the lost performance back. (You can read my article here: https://www.scylladb.com/2018/08/09/cost-containerization-scylla/) So basically I need to ensure that the proper cpuset-cpus value is passed down to the docker layer, which depends on having cpu-manager-policy set to static on the kubelets.

The steps I’m using right now are these:

kubectl drain $NODE_NAME --force --ignore-daemonsets --delete-local-data --grace-period=60
gcloud compute --project "skilled-adapter-452" ssh --zone "us-east1-b" "$NODE_NAME"
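sudo vi /etc/default/kubelet
  # append to KUBELET_OPTS:
  --cpu-manager-policy=static
sudo vi /home/kubernetes/kubelet-config.yaml
  # add or change this key:
  cpuManagerPolicy: static
# remove the state left over from the previous policy before restarting
sudo rm /var/lib/kubelet/cpu_manager_state
sudo systemctl restart kubelet
sudo journalctl -u kubelet
  # Check the results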
exit
kubectl uncordon $NODE_NAME
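
The kubelet-config.yaml edit in the steps above could also be done without an editor. The following is only a rough sketch: `set_cpu_manager_policy` is a hypothetical helper name, and it assumes cpuManagerPolicy is a top-level key and that the file ends with a newline. It does not touch KUBELET_OPTS in /etc/default/kubelet, and it does nothing to stop GKE from reverting the file.

```shell
#!/usr/bin/env bash
# Sketch: set cpuManagerPolicy in a kubelet config file non-interactively.
# set_cpu_manager_policy is a hypothetical helper, not part of any GKE tooling.
set -euo pipefail

set_cpu_manager_policy() {
  local config_file="$1" policy="$2"
  if grep -q '^cpuManagerPolicy:' "$config_file"; then
    # Key already present at top level: rewrite its value in place.
    # "-i.bak" is accepted by both GNU and BSD sed; the backup is removed after.
    sed -i.bak "s/^cpuManagerPolicy:.*/cpuManagerPolicy: ${policy}/" "$config_file"
    rm -f "${config_file}.bak"
  else
    # Key absent: append it as a new top-level entry.
    printf 'cpuManagerPolicy: %s\n' "$policy" >> "$config_file"
  fi
}
```

On the node this would be called with root privileges as `set_cpu_manager_policy /home/kubernetes/kubelet-config.yaml static`, followed by the same cpu_manager_state removal and kubelet restart as in the manual steps.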

But as I mentioned before, they get rolled back in less than 24h.
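
To pin down when the rollback happens, a small check like the one below could be run periodically on the node. This is a sketch under assumptions: `current_cpu_manager_policy` is a hypothetical helper name, it only looks at a top-level cpuManagerPolicy key in the file it is given, and it reports "none" (the kubelet default) when the key is missing.

```shell
#!/usr/bin/env bash
# Sketch: report which CPU manager policy a kubelet config file requests.
# current_cpu_manager_policy is a hypothetical helper name.

current_cpu_manager_policy() {
  local config_file="$1"
  local value
  # Extract the value of the first top-level cpuManagerPolicy key, if any.
  value="$(sed -n 's/^cpuManagerPolicy:[[:space:]]*//p' "$config_file" | head -n 1)"
  # Fall back to "none", the kubelet default, when the key is absent.
  printf '%s\n' "${value:-none}"
}
```

Logging a line such as `echo "$(date -u +%FT%TZ) $(current_cpu_manager_policy /home/kubernetes/kubelet-config.yaml)"` every few minutes would show roughly when the value flips back.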

I completely understand what you are saying about arbitrary changes and long-term support, but I don’t think setting cpu-manager-policy to static is unreasonable.

Again, thanks for taking the time to help out!


#4

https://issuetracker.google.com/issues/111408108