Hear ye, hear ye! It’s the third Wednesday of the month, the 17th, and you know what that means: it’s time for office hours, our monthly livestream where a panel of experts answers your user questions. If you’re stuck on something, come on by and let us take a stab at it:
My questions are: how does everyone stay up to date on Kubernetes and other work topics? How much time do you spend on it per week, during office hours? (pun intended) And how do you combine it with family life?
Hello, we are having a hard time running k8s jobs that use a 30 GB image. The image takes up to 50 minutes to pull on a newly created node. Reducing the image size is not an option, and the pull time is bound by decompression, not download, so local registries won’t help. We use autoscaling to allocate resources on demand, but this is irritating because a job that takes 5 minutes to run takes almost an hour if the node is newly created. Keeping a node up with a prepulled image waiting for jobs is also very expensive for us, because the nodes have GPUs. Any way to deal with that? Thanks in advance
The general consensus seems to be to put the data in a volume and mount it when the nodes come up, instead of copying it over to every new node.
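One hedged sketch of that approach: keep the image small (code only) and serve the large data from a pre-provisioned PersistentVolumeClaim. The image name, claim name, and mount path below are all hypothetical:

```yaml
# Sketch only: a Job whose container mounts a PVC that already holds the
# large data, so a freshly autoscaled node never pulls or decompresses it.
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-job
spec:
  template:
    spec:
      containers:
      - name: worker
        image: registry.example.com/worker:slim   # small image, code only
        volumeMounts:
        - name: model-data
          mountPath: /data                        # large assets read from the volume
      volumes:
      - name: model-data
        persistentVolumeClaim:
          claimName: model-data                   # assumed pre-provisioned PVC
      restartPolicy: Never
```

How the PVC itself gets populated (and whether it can be mounted read-only by many jobs at once) depends on your storage class, so treat this as a starting point rather than a drop-in fix.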
Question: My question relates to something I noticed in the setup of our clusters, namely that each application has its own namespace, named after the application itself. This causes a few annoyances:
- There’s no way to get all pods across several namespaces in a single command.
- For each app I have in mind, I need to type out its particular namespace.
- I can’t use kubectl config set-context --current --namespace to alleviate some of the typing.
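For what it’s worth, a few kubectl invocations that may soften these pain points (these need a live cluster to try; the namespace name is hypothetical):

```shell
# List pods across every namespace in one command
kubectl get pods --all-namespaces      # short form: kubectl get pods -A

# Target a single app's namespace explicitly
kubectl get pods -n my-app

# Set a default namespace on the current context to cut down on typing
kubectl config set-context --current --namespace=my-app
```

The per-app-namespace layout still means switching contexts as you move between apps, which is presumably the heart of the question for the panel.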
Question: I create a pod and then shut down the node that runs it, yet “kubectl get pods” shows its status as “Running”. Why? Shouldn’t it be dead? And sometimes it changes status to “Terminating” and keeps that status indefinitely.
Question: My Kubernetes cluster is deploying a pod onto a node that is not in my cluster: Warning FailedScheduling default-scheduler node “ip-10-0-.ec2.internal” not found in cache. The node is not in the cluster, but the scheduler still picks it for deployment.