The pod that I created is stuck in the Pending state, and kubectl describe shows this error:
root@ttogpu:~# kubectl describe pod triton-inference-server-5b6c7f889c-f54c6
Name:             triton-inference-server-5b6c7f889c-f54c6
Namespace:        default
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app=triton-inference-server
                  pod-template-hash=5b6c7f889c
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/triton-inference-server-5b6c7f889c
Containers:
  triton-server:
    Image:       triton_server:latest
    Ports:       8000/TCP, 8001/TCP, 8002/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Limits:
      nvidia.com/gpu:  1
    Requests:
      nvidia.com/gpu:  1
    Environment:
      DP_DISABLE_HEALTHCHECKS:  xids
    Mounts:
      /models from model-repository (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sczwq (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  model-repository:
    Type:          HostPath (bare host directory volume)
    Path:          /path/to/host/model/directory
    HostPathType:
  kube-api-access-sczwq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                 nvidia.com/gpu:NoSchedule op=Exists
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  2m58s  default-scheduler  0/1 nodes are available: 1 Insufficient nvidia.com/gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..
Any suggestions on how to solve this error?
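
The scheduler message says the only node has no free nvidia.com/gpu, so one check that may help narrow this down is to see whether the node advertises that resource at all. The helper below is only a sketch: the function name gpu_allocatable is made up for illustration, and it assumes kubectl is already pointed at this cluster. It does not fix anything; it just prints each node alongside its allocatable GPU count.

```shell
# Hypothetical helper (name is illustrative, not part of kubectl).
# Lists every node with the number of nvidia.com/gpu resources it reports
# as allocatable. The dot in "nvidia.com" is escaped so that kubectl reads
# it as part of the resource name rather than as a path separator.
gpu_allocatable() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}
```

Running gpu_allocatable and seeing <none> or 0 in the GPU column would suggest the node is not exposing any GPUs to Kubernetes, while a value of 1 or more would suggest the GPUs are already claimed by other pods.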