Hi all,
trust this finds you well. For a few days now - I can’t tell for sure when it started - I’ve been seeing some strange behavior in some of our deployments: pods get stuck in the Init state. The issue only hits one particular namespace, and only a few deployments within that namespace; even within an affected deployment not all replicas are hit - some run fine, some have the issue.
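For context, this is how I’m spotting the stuck pods (namespace and label taken from the describe output further down):

    kubectl get pods -n mdms-microservices -l app=mdms-cleanup-worker
    # healthy replicas show Running; affected ones sit at Init:0/1 indefinitely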
The affected pods show the following event:
Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo default-token-f9v2j persistent-storage]: timed out waiting for the condition
I couldn’t find anything meaningful on the Internet. Restarting the pods doesn’t help, nor does restarting or recreating the affected deployments. Even recreating the namespace from scratch doesn’t help, and I have no clue what might have changed since the issue started.
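For completeness, these are roughly the commands I used for the restarts (names as in the describe output below) - none of them made a difference:

    # delete the stuck pod so the ReplicaSet recreates it
    kubectl delete pod mdms-cleanup-worker-v21.6.0-8556cb6f8d-nhb6f -n mdms-microservices
    # rolling restart of the deployment
    kubectl rollout restart deployment/mdms-cleanup-worker-v21.6.0 -n mdms-microservices
    # recreate the deployment from scratch
    kubectl delete deployment mdms-cleanup-worker-v21.6.0 -n mdms-microservices
    kubectl apply -f mdms-cleanup-worker.yaml   # placeholder for our actual manifest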
Any hint would be much appreciated.
Cluster information:
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"132a687512d7fb058d0f5890f07d4121b3f0a2e2", GitTreeState:"clean", BuildDate:"2021-05-12T12:40:09Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.7-eks-d88609", GitCommit:"d886092805d5cc3a47ed5cf0c43de38ce442dfcb", GitTreeState:"clean", BuildDate:"2021-07-31T00:29:12Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
Cloud being used: AWS EKS
Host OS: Amazon Linux 2 5.4.156-83.273.amzn2.x86_64 docker://20.10.7
kubectl describe output from a failed pod:
Name: mdms-cleanup-worker-v21.6.0-8556cb6f8d-nhb6f
Namespace: mdms-microservices
Priority: 0
Node: ip-100-64-43-90.eu-west-1.compute.internal/100.64.43.90
Start Time: Wed, 24 Nov 2021 09:30:23 +0100
Labels: app=mdms-cleanup-worker
application=mdms
module=microservices
pod-template-hash=8556cb6f8d
security.istio.io/tlsMode=istio
service.istio.io/canonical-name=mdms-cleanup-worker
service.istio.io/canonical-revision=v21.6.0
tier=backend
version=v21.6.0
Annotations: kubectl.kubernetes.io/default-container: mdms-cleanup-worker
kubectl.kubernetes.io/default-logs-container: mdms-cleanup-worker
kubernetes.io/psp: eks.privileged
prometheus.io/path: /stats/prometheus
prometheus.io/port: 15020
prometheus.io/scrape: true
sidecar.istio.io/status:
{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-data","istio-podinfo","istio-token","istiod-...
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/mdms-cleanup-worker-v21.6.0-8556cb6f8d
Init Containers:
istio-init:
Container ID:
Image: gcr.io/istio-release/proxyv2:1.11.4
Image ID:
Port: <none>
Host Port: <none>
Args:
istio-iptables
-p
15001
-z
15006
-u
1337
-m
REDIRECT
-i
*
-x
-b
*
-d
15090,15021,15020
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 10m
memory: 40Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-f9v2j (ro)
Containers:
mdms-cleanup-worker:
Container ID:
Image: 550038091055.dkr.ecr.eu-west-1.amazonaws.com/renewables-uai1007324-mdms-microservices:cleanup-worker_21.12.0
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 100m
memory: 500Mi
Environment:
NATS_SERVER_ADDR: nats://nats.utilities:4222
CLUSTER_ID: ge-mdmswind-production-nats-cluster
MDMS_DATABASE_GATEWAY_ADDR: http://mdms-database-gateway.mdms-microservices:8180
AWS_ACCESSKEY:
AWS_SECRETKEY:
AWS_REGION: eu-west-1
Mounts:
/tmp from persistent-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-f9v2j (ro)
istio-proxy:
Container ID:
Image: gcr.io/istio-release/proxyv2:1.11.4
Image ID:
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--proxyLogLevel=warning
--proxyComponentLogLevel=misc:error
--log_output_level=default:info
--concurrency
2
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 10m
memory: 40Mi
Readiness: http-get http://:15021/healthz/ready delay=1s timeout=3s period=2s #success=1 #failure=30
Environment:
JWT_POLICY: third-party-jwt
PILOT_CERT_PROVIDER: istiod
CA_ADDR: istiod.istio-system.svc:15012
POD_NAME: mdms-cleanup-worker-v21.6.0-8556cb6f8d-nhb6f (v1:metadata.name)
POD_NAMESPACE: mdms-microservices (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
HOST_IP: (v1:status.hostIP)
PROXY_CONFIG: {}
ISTIO_META_POD_PORTS: [
]
ISTIO_META_APP_CONTAINERS: mdms-cleanup-worker
ISTIO_META_CLUSTER_ID: Kubernetes
ISTIO_META_INTERCEPTION_MODE: REDIRECT
ISTIO_META_WORKLOAD_NAME: mdms-cleanup-worker-v21.6.0
ISTIO_META_OWNER: kubernetes://apis/apps/v1/namespaces/mdms-microservices/deployments/mdms-cleanup-worker-v21.6.0
ISTIO_META_MESH_ID: cluster.local
TRUST_DOMAIN: cluster.local
Mounts:
/etc/istio/pod from istio-podinfo (rw)
/etc/istio/proxy from istio-envoy (rw)
/var/lib/istio/data from istio-data (rw)
/var/run/secrets/istio from istiod-ca-cert (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-f9v2j (ro)
/var/run/secrets/tokens from istio-token (rw)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
istio-podinfo:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
metadata.labels -> labels
metadata.annotations -> annotations
istio-token:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 43200
istiod-ca-cert:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-ca-root-cert
Optional: false
persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mdms-microservice-temp
ReadOnly: false
default-token-f9v2j:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-f9v2j
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 16m (x6 over 68m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[default-token-f9v2j persistent-storage istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo]: timed out waiting for the condition
Warning FailedMount 14m (x2 over 48m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istio-envoy istio-token istio-podinfo default-token-f9v2j persistent-storage istiod-ca-cert istio-data]: timed out waiting for the condition
Warning FailedMount 12m (x4 over 46m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istio-podinfo default-token-f9v2j persistent-storage istiod-ca-cert istio-data istio-envoy istio-token]: timed out waiting for the condition
Warning FailedMount 7m50s (x6 over 73m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo default-token-f9v2j persistent-storage]: timed out waiting for the condition
Warning FailedMount 5m32s (x7 over 62m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istio-data istio-envoy istio-token istio-podinfo default-token-f9v2j persistent-storage istiod-ca-cert]: timed out waiting for the condition
Warning FailedMount 3m15s (x6 over 57m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[persistent-storage istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo default-token-f9v2j]: timed out waiting for the condition
Warning FailedMount 59s (x2 over 23m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[istio-token istio-podinfo default-token-f9v2j persistent-storage istiod-ca-cert istio-data istio-envoy]: timed out waiting for the condition
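Since the unmounted volume is always the PVC, here is how I’d check the claim and the attach state on the node, in case anyone wants to see that output as well (ClaimName and node name taken from the describe output above):

    # claim status and the PV it is bound to
    kubectl get pvc mdms-microservice-temp -n mdms-microservices
    kubectl describe pv $(kubectl get pvc mdms-microservice-temp -n mdms-microservices -o jsonpath='{.spec.volumeName}')
    # attach state as seen by the attach/detach controller (cluster-wide, filtered on the affected node)
    kubectl get volumeattachment | grep ip-100-64-43-90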