Can't start pods after a disk pressure problem

I can no longer run pods on my MicroK8s install. After creating a pod I see the event for pulling the image, but there is never any sign of progress after that.

Earlier I had a problem where pods would not start because of disk pressure, so I cleared some space on the server. At one point I also tried to prune unused images from the registry, but I don’t think that accomplished anything.

I don’t see any obvious errors in microk8s services in the system journal.
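For anyone retracing this step, here is a rough sketch of how the journal could be checked per service. The unit names in the trailing comments are an assumption (MicroK8s releases from around this time ran the kubelet and containerd as separate snap units; other releases may bundle them differently), so list the units on your host first.

```shell
# List the MicroK8s systemd units that actually exist on this host.
list_microk8s_units() {
  systemctl list-units --no-pager 'snap.microk8s.*'
}

# Show the last hour of log lines for one unit; the unit name is passed as $1.
show_unit_log() {
  journalctl --no-pager -u "$1" --since "1 hour ago"
}

# Example calls (unit names are assumptions, confirm with list_microk8s_units):
# show_unit_log snap.microk8s.daemon-containerd
# show_unit_log snap.microk8s.daemon-kubelet
```

Since image pulls are handled by the container runtime, the containerd unit is probably the more interesting of the two for a pull that never progresses.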

To demonstrate, I created the following pod:

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
  labels:
    app: ubuntu
spec:
  containers:
  - image: ubuntu
    command:
      - "sleep"
      - "604800"
    imagePullPolicy: IfNotPresent
    name: ubuntu
  restartPolicy: Always

Describing the pod after creating it:

Name:         debug
Namespace:    default
Priority:     0
Node:         ubunutu-server/10.41.1.95
Start Time:   Sat, 14 Nov 2020 19:20:24 +0000
Labels:       app=debug
Annotations:  cni.projectcalico.org/podIP: 10.1.24.129/32
              cni.projectcalico.org/podIPs: 10.1.24.129/32
Status:       Pending
IP:           
IPs:          <none>
Containers:
  ubuntu:
    Container ID:  
    Image:         ubuntu
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      604800
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-hzbq4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-hzbq4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-hzbq4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  8m26s  default-scheduler  Successfully assigned default/debug to ubunutu-server
  Normal  Pulling    8m25s  kubelet            Pulling image "ubuntu"

That “Pulling image” event is the last event I ever get.

It is hard to say what may be happening in the cluster. Could you share the tarball produced by microk8s inspect (maybe in a GitHub issue at [1])?

[1] https://github.com/ubuntu/microk8s/issues
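A minimal sketch of collecting that report is below. The function name and the log file name are made up for illustration; microk8s inspect itself is the real command, and it may need to be run with sudo depending on how your user is set up.

```shell
# Run the MicroK8s diagnostics and keep a copy of the console output.
# The command prints where it wrote the report tarball; the log file name
# here (microk8s-inspect.log) is an arbitrary choice for this example.
collect_inspect_report() {
  microk8s inspect 2>&1 | tee microk8s-inspect.log
}

# Example call:
# collect_inspect_report
```

The tarball named in the output is the file to attach to the issue.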

The problem cleared up after another restart of microk8s. I stopped investigating after that.

Thank you for the reply.
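For reference, the restart mentioned above can be sketched roughly as follows. The function name is hypothetical, and this only illustrates the stop/start workaround; nothing in this thread identifies the underlying cause.

```shell
# Stop and start MicroK8s, then wait until it reports ready.
# Each step runs only if the previous one succeeded.
restart_microk8s() {
  microk8s stop &&
  microk8s start &&
  microk8s status --wait-ready
}

# After the restart, re-check the events of the stuck pod
# (pod name "debug" is taken from the describe output earlier in the thread):
# restart_microk8s && microk8s kubectl describe pod debug
```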

I did not have a disk pressure issue, but I saw something similar where image pulls just hung on container init. A stop/start of microk8s also fixed my problem. I attached some inspection tarballs to issue #1113 in ubuntu/microk8s on GitHub ("Pull after 1h doesn't work").