Kubectl super slow after Big Sur upgrade

Cluster information:

Kubernetes version: several, mostly in the 1.17.17-gke.3700 ballpark.
Cloud being used: Google
Installation method: GKE Terraform provider
Host OS: Google Container-Optimized OS
CNI and version:
CRI and version:

Client info

  • 2019 MacBook
  • 2.4 GHz 8-core Intel i9
  • Sophos Endpoint enabled
  • macOS Big Sur 11.3
  • kubectl version: 1.21 via brew

General overview

I updated to Big Sur last Thursday. Since upgrading, I have observed that a ‘cold’ kubectl command can take well over a minute. Worse, it triggers the beachball, stalling other apps, audio, etc.

It seems like the issue “kubectl is really slow on MacOS BigSur” (Issue #1020, kubernetes/kubectl on GitHub) and the linked “"kubectl get <any-resource>" is very slow due to custom/external metrics” (Issue #98801, kubernetes/kubernetes on GitHub) discuss some sort of caching issue.

I’m following those issues now, but I’m curious whether anyone else experiences the beachball. Are there any workarounds folks have found?

I’d be only slightly frustrated by a slow kubectl, but the stall of the whole OS is enough to make me consider downgrading back to Catalina :frowning:

I’m having the same issue, and I suspect the common denominator might be Sophos. I have found a workaround, though: pass --cache-dir=/dev/null to your kubectl commands (I suggest adding it to your .bashrc or .zshrc as an alias for kubectl). You can also symlink ~/.kube/cache and ~/.kube/http-cache to /dev/null.

Every command will take a little longer, because kubectl never has a cache to work from (a normal cached kubectl command takes a couple hundred milliseconds), but your system will remain responsive. My commands take just over a second on kubectl 1.16, which is a huge relief from waiting 50 seconds or more on an unresponsive system. On 1.18 it takes about 10 seconds, due to the additional API calls that version of kubectl makes.
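For anyone who wants to copy the workaround, here is a minimal sketch of what the shell setup could look like. It uses a shell function instead of an alias (my choice, not something from the post above) so that it also applies when the file is sourced from a script. The only flag it relies on is the --cache-dir one already mentioned.

```shell
# Wrap kubectl so that every invocation skips the on-disk cache.
# Put this in ~/.bashrc or ~/.zshrc.
kubectl() {
  # "command" bypasses this function and runs the real kubectl binary on PATH.
  command kubectl --cache-dir=/dev/null "$@"
}
```

After reloading your shell, a plain "kubectl get pods" should then run as "kubectl --cache-dir=/dev/null get pods". Remove the function again once the underlying problem is fixed, since you lose the speed benefit of the cache while it is in place.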

I even tried putting the cache on a RAM drive and saw the same poor performance, which leads me to believe the problem is not the storage medium but the file I/O itself (which I imagine Sophos has hooked into).
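If you want to compare setups yourself (default cache, /dev/null, RAM drive), a small timing helper is enough, since the differences reported here are whole seconds. This is only a sketch: time_cmd is a made-up name, and it measures wall-clock time in whole seconds using date.

```shell
# Print how many whole seconds a command took, discarding its output.
time_cmd() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo "$((end - start))"
}
```

For example, "time_cmd kubectl get pods" versus "time_cmd kubectl --cache-dir=/dev/null get pods" should show whether the cache is what makes the command slow on your machine.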

Client info

  • 2020 MacBook
  • 2 GHz quad-core Intel i5
  • Sophos Endpoint enabled
  • macOS Big Sur 11.4
  • kubectl version: v1.16.15 binary release (I’ve tried every minor version up to 1.20, same results. 1.16 is the quickest because it makes fewer API calls.)

A coworker uninstalled Sophos and the problem went away for him. I’ve since gotten a fresh install without it as well, and it’s been much better.

We’ve been trying to work with Sophos, but we’re a relatively small group, so I don’t think we’ll make a ton of progress.

Good luck :frowning:

Just chiming in with the same exact problem. Disabling the Sophos com.sophos.endpoint.networkextension and com.sophos.endpoint.scanextension extensions brought everything back to normal for me, although YMMV with your local IT department. Smells like a problem with Sophos itself.

I’ve also run into this issue, but you don’t have to completely disable Sophos to fix it. You can just exclude the ~/.kube/cache and ~/.kube/http-cache directories from the scans.
In Sophos, go to Preferences → Protection → General → Exceptions and add both directories there.

Kubectl was running really slowly for me. This is what helped me speed it up.

On my local instance, I followed these steps:

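# Remove the existing on-disk cache
rm -rf ~/.kube/cache
# Create a RAM-backed directory (/dev/shm is a tmpfs mount on Linux)
mkdir -p /dev/shm/kube_cache
# Point kubectl's cache path at the RAM-backed directory
ln -s /dev/shm/kube_cache ~/.kube/cache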

I think the cause could be that my HDD was slow and kubectl reads a lot of files on every command. You can see this in action if you run any kubectl command with high verbosity (-v 8). The steps above move your local kubectl cache to shared memory (RAM). (Please do your due diligence before running them.)
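One caveat with the /dev/shm approach: /dev/shm is a Linux tmpfs, so its contents are gone after a reboot and the symlink then points at a directory that no longer exists. macOS has no /dev/shm at all. A possible way to handle this from your shell startup file is sketched below; ensure_kube_cache is a hypothetical helper name, and the optional argument exists only so the function can be tried against a different directory.

```shell
# Recreate the RAM-backed kubectl cache directory if it is missing.
# $1 (optional): root of the RAM-backed filesystem, defaults to /dev/shm.
ensure_kube_cache() {
  shm_root=${1:-/dev/shm}
  if [ -d "$shm_root" ]; then
    mkdir -p "$shm_root/kube_cache"
  else
    # No such filesystem here (e.g. on macOS): do nothing and report it.
    echo "no $shm_root on this system; leaving the kubectl cache alone" >&2
    return 1
  fi
}
```

Calling "ensure_kube_cache" from ~/.bashrc or ~/.zshrc after the function definition would recreate the directory on each new shell.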