Istio-pilot error when deploying kubeflow

Hi everyone!

I want to install Kubeflow with GPU support, so I adapted the Canonical tutorial and installed MicroK8s (v1.29.4, revision 6809, classic) with the nvidia addon.

When I try to deploy Kubeflow with:

juju deploy kubeflow --trust --channel=1.8/stable

the deployment gets stuck because of an error in the unit “istio-pilot/0” (see below):
juju status
Model Controller Cloud/Region Version SLA Timestamp
kubeflow uk8sx my-k8s/localhost 3.4.5 unsupported 16:36:15+02:00

App Version Status Scale Charm Channel Rev Address Exposed Message
admission-webhook active 1 admission-webhook 1.8/stable 301 10.152.183.52 no
argo-controller active 1 argo-controller 3.3.10/stable 424 10.152.183.46 no
dex-auth active 1 dex-auth 2.36/stable 422 10.152.183.22 no
envoy res:oci-image@cc06b3e active 1 envoy 2.0/stable 194 10.152.183.103 no
istio-ingressgateway active 1 istio-gateway 1.17/stable 1000 10.152.183.154 no
istio-pilot waiting 1 istio-pilot 1.17/stable 1011 10.152.183.125 no installing agent
jupyter-controller active 1 jupyter-controller 1.8/stable 849 10.152.183.63 no
jupyter-ui active 1 jupyter-ui 1.8/stable 858 10.152.183.244 no
katib-controller res:oci-image@31ccd70 active 1 katib-controller 0.16/stable 576 10.152.183.226 no
katib-db 8.0.36-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/stable 153 10.152.183.223 no
katib-db-manager active 1 katib-db-manager 0.16/stable 539 10.152.183.76 no
katib-ui active 1 katib-ui 0.16/stable 422 10.152.183.87 no
kfp-api active 1 kfp-api 2.0/stable 1283 10.152.183.55 no
kfp-db 8.0.36-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/stable 153 10.152.183.84 no
kfp-metadata-writer active 1 kfp-metadata-writer 2.0/stable 334 10.152.183.91 no
kfp-persistence active 1 kfp-persistence 2.0/stable 1291 10.152.183.85 no
kfp-profile-controller active 1 kfp-profile-controller 2.0/stable 1315 10.152.183.142 no
kfp-schedwf active 1 kfp-schedwf 2.0/stable 1466 10.152.183.119 no
kfp-ui active 1 kfp-ui 2.0/stable 1285 10.152.183.184 no
kfp-viewer active 1 kfp-viewer 2.0/stable 1317 10.152.183.56 no
kfp-viz active 1 kfp-viz 2.0/stable 1235 10.152.183.31 no
knative-eventing active 1 knative-eventing 1.10/stable 353 10.152.183.47 no
knative-operator active 1 knative-operator 1.10/stable 328 10.152.183.111 no
knative-serving active 1 knative-serving 1.10/stable 409 10.152.183.237 no
kserve-controller active 1 kserve-controller 0.11/stable 573 10.152.183.218 no
kubeflow-dashboard active 1 kubeflow-dashboard 1.8/stable 582 10.152.183.194 no
kubeflow-profiles active 1 kubeflow-profiles 1.8/stable 355 10.152.183.232 no
kubeflow-roles active 1 kubeflow-roles 1.8/stable 187 10.152.183.239 no
kubeflow-volumes res:oci-image@2261827 active 1 kubeflow-volumes 1.8/stable 260 10.152.183.251 no
metacontroller-operator active 1 metacontroller-operator 3.0/stable 252 10.152.183.143 no
minio res:oci-image@1755999 active 1 minio ckf-1.8/stable 278 10.152.183.53 no
mlmd res:oci-image@44abc5d active 1 mlmd 1.14/stable 127 10.152.183.157 no
oidc-gatekeeper active 1 oidc-gatekeeper ckf-1.8/stable 350 10.152.183.57 no
pvcviewer-operator active 1 pvcviewer-operator 1.8/stable 30 10.152.183.188 no
seldon-controller-manager active 1 seldon-core 1.17/stable 664 10.152.183.24 no
tensorboard-controller active 1 tensorboard-controller 1.8/stable 257 10.152.183.89 no
tensorboards-web-app active 1 tensorboards-web-app 1.8/stable 245 10.152.183.28 no
training-operator active 1 training-operator 1.7/stable 347 10.152.183.248 no

Unit Workload Agent Address Ports Message
admission-webhook/0* active idle 10.1.14.122
argo-controller/0* active idle 10.1.14.97
dex-auth/0* active idle 10.1.14.82
envoy/0* active idle 10.1.14.136 9090,9901/TCP
istio-ingressgateway/0* active idle 10.1.14.118
istio-pilot/0* error idle 10.1.14.65 hook failed: “ingress-relation-created”
jupyter-controller/0* active idle 10.1.14.66
jupyter-ui/0* active idle 10.1.14.123
katib-controller/0* active idle 10.1.14.140 443,8080/TCP
katib-db-manager/0* active idle 10.1.14.91
katib-db/0* active idle 10.1.14.79 Primary
katib-ui/0* active idle 10.1.14.100
kfp-api/0* active idle 10.1.14.119
kfp-db/0* active idle 10.1.14.86 Primary
kfp-metadata-writer/0* active idle 10.1.14.121
kfp-persistence/0* active idle 10.1.14.99
kfp-profile-controller/0* active idle 10.1.14.93
kfp-schedwf/0* active idle 10.1.14.101
kfp-ui/0* active idle 10.1.14.67
kfp-viewer/0* active idle 10.1.14.90
kfp-viz/0* active idle 10.1.14.68
knative-eventing/0* active idle 10.1.14.89
knative-operator/0* active idle 10.1.14.109
knative-serving/0* active idle 10.1.14.102
kserve-controller/0* active idle 10.1.14.107
kubeflow-dashboard/0* active idle 10.1.14.95
kubeflow-profiles/0* active idle 10.1.14.106
kubeflow-roles/0* active idle 10.1.14.114
kubeflow-volumes/0* active idle 10.1.14.134 5000/TCP
metacontroller-operator/0* active idle 10.1.14.103
minio/0* active idle 10.1.14.142 9000-9001/TCP
mlmd/0* active idle 10.1.14.139 8080/TCP
oidc-gatekeeper/0* active idle 10.1.14.84
pvcviewer-operator/0* active idle 10.1.14.98
seldon-controller-manager/0* active idle 10.1.14.110
tensorboard-controller/0* active idle 10.1.14.117
tensorboards-web-app/0* active idle 10.1.14.96
training-operator/0* active idle 10.1.14.112

Here is some more information:
juju debug-log --replay --include=istio-pilot
unit-istio-pilot-0: 16:33:29 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:34:10 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:34:10 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:34:10 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:35:13 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:35:51 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:36:54 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:38:29 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:38:30 INFO unit.istio-pilot/0.juju-log ingress:20: HTTP Request: GET https://10.152.183.1/api/v1/namespaces/kubeflow/services/istio-ingressgateway-workload “HTTP/1.1 200 OK”
unit-istio-pilot-0: 16:38:30 ERROR unit.istio-pilot/0.juju-log ingress:20: Uncaught exception while in charm code:
Traceback (most recent call last):
File “./src/charm.py”, line 1203, in <module>
main(Operator)
File “/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py”, line 540, in main
manager = _Manager(
File “/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py”, line 424, in __init__
self.charm = self._make_charm(self.framework, self.dispatcher)
File “/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py”, line 427, in _make_charm
charm = self._charm_class(framework)
File “./src/charm.py”, line 116, in __init__
cert_subject=self._cert_subject,
File “./src/charm.py”, line 474, in _cert_subject
svc_address = _get_gateway_address_from_svc(svc)
File “./src/charm.py”, line 1057, in _get_gateway_address_from_svc
gateway_address = _get_address_from_loadbalancer(svc)
File “./src/charm.py”, line 1072, in _get_address_from_loadbalancer
if len(ingresses) != 1:
TypeError: object of type ‘NoneType’ has no len()
unit-istio-pilot-0: 16:38:30 ERROR juju.worker.uniter.operation hook “ingress-relation-created” (via hook dispatching script: dispatch) failed: exit status 1
unit-istio-pilot-0: 16:38:30 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook
unit-istio-pilot-0: 16:41:04 INFO juju.worker.uniter awaiting error resolution for “relation-created” hook

I could not find a solution on any other forum. What can I do to get the istio-pilot unit to start?

More than likely you don’t have anything enabled or installed that provides the LoadBalancer service type. When the istio-pilot charm tries to get the IP of the load balancer, it retrieves the istio-ingressgateway-workload service successfully (the 200 OK in your log), but that service has no load balancer address assigned, so the list of ingress addresses is None. Hence the “object of type ‘NoneType’ has no len()” error. Enable the metallb addon and that should resolve the issue.
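
To make the failure mode concrete, here is a minimal sketch in Python. It is not the charm's actual code: the function name address_from_loadbalancer, the plain-dict Service objects, and the sample IP are all assumptions for illustration. It only shows why a LoadBalancer service with no provider behind it yields None where the traceback expects a list, and what a None-safe check could look like.

```python
from typing import Optional


def address_from_loadbalancer(service: dict) -> Optional[str]:
    """Return the single external address of a LoadBalancer Service, or None.

    `service` is a plain dict shaped like a Kubernetes Service object
    (hypothetical stand-in for whatever object the charm really uses).
    """
    ingresses = service.get("status", {}).get("loadBalancer", {}).get("ingress")
    # With no load balancer provider (e.g. MetalLB) the "ingress" key is
    # absent, so `ingresses` is None. Calling len(None) at this point is
    # what raises "object of type 'NoneType' has no len()".
    if not ingresses or len(ingresses) != 1:
        return None
    entry = ingresses[0]
    # A provider reports either an IP or a hostname for the entry.
    return entry.get("ip") or entry.get("hostname")


# Service as it looks while no address has been assigned.
pending = {"spec": {"type": "LoadBalancer"}, "status": {"loadBalancer": {}}}

# Service after a provider has assigned an address (sample IP, made up).
assigned = {
    "spec": {"type": "LoadBalancer"},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.64.140.43"}]}},
}

print(address_from_loadbalancer(pending))
print(address_from_loadbalancer(assigned))
```

The first call corresponds to your current state: the service exists, but nothing has filled in its load balancer address. Once a provider such as MetalLB assigns one, the second shape applies and the lookup succeeds.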