cdf-apiserver not starting on all Kubernetes master and worker nodes


#1

Hello Experts,

We are implementing a Kubernetes cluster with three master and three worker nodes to deploy Micro Focus ITOM CDF. CDF installed successfully, but the cdf-apiserver pod is in CrashLoopBackOff state on all the servers. Also, vault status results in the error below.

vault status

Error checking seal status: Get https://127.0.0.1:8200/v1/sys/seal-status: x509: certificate is valid for 172.17.17.1, 10.5.4.140, not 127.0.0.1

Please find the logs from the failed pods below and provide your inputs.

[root@~]# kubectl logs cdf-apiserver-659f8fbd4c-r7lhg -n core -c dependence
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
nc: bad address 'kubernetes-vault'
waiting for kubernetes-vault
kubernetes-vault (172.16.21.6:8899) open

Thanks & Regards,
Mayur


#2

Can you post more information? It’s difficult to troubleshoot things like CrashLoopBackOff without further logs, deployment specs, etc.

As far as the vault error goes: you’re connecting to an address that the cert is not signed for.

Get https://127.0.0.1:8200/v1/sys/seal-status: x509: certificate is valid for 172.17.17.1, 10.5.4.140, not 127.0.0.1

I’d suggest signing the cert for the internal DNS names (kubernetes-vault and kubernetes-vault.<namespace>.svc.cluster.local), and also for 127.0.0.1 if anything queries the seal status locally.
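For reference, a minimal sketch of re-issuing the Vault server cert with a complete SAN list. Assumptions: OpenSSL 1.1.1+ (for -addext), namespace core, and the IPs taken from your error message; the self-signed flow and file names are only illustrative, since in a real CDF install the cert must be signed by the CA the pods already trust.

```shell
# Issue a cert whose SANs cover every address a client may use:
# the service DNS names, the node IPs from the error, and loopback.
# Self-signed here purely for illustration -- sign with your real CA.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout vault.key -out vault.crt \
  -subj "/CN=kubernetes-vault" \
  -addext "subjectAltName=DNS:kubernetes-vault,DNS:kubernetes-vault.core.svc.cluster.local,IP:172.17.17.1,IP:10.5.4.140,IP:127.0.0.1"

# Confirm the SAN list before swapping the cert in and restarting Vault:
openssl x509 -in vault.crt -noout -ext subjectAltName
```

Once Vault is restarted with the new cert, vault status against https://127.0.0.1:8200 should no longer fail x509 verification.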


#3

Hello,

Thanks for your response.

Can you please provide the steps to sign the certs for kubernetes-vault? Please find the describe output and logs from the failed cdf-apiserver pod below.

kubectl describe pod cdf-apiserver-659f8fbd4c-8b69d -n core
Name: cdf-apiserver-659f8fbd4c-8b69d
Namespace: core
Priority: 0
PriorityClassName:
Node: FQDN/ 10.5.4.143
Start Time: Thu, 14 Mar 2019 17:00:43 +0300
Labels: app=suite-installer-app
pod-template-hash=2159496807
role=cdf-podpreset
Annotations: pod.boostport.com/vault-approle=core-baseinfra
pod.boostport.com/vault-init-container=install
podpreset.admission.kubernetes.io/podpreset-itom-cdf-podpreset=301
Status: Running
IP: 172.16.41.15
Controlled By: ReplicaSet/cdf-apiserver-659f8fbd4c
Init Containers:
dependence:
Container ID: docker://b5f9fc804af10e397895c3369732612f6b5850268f1477714f8f079aa6363a33
Image: localhost:5000/hpeswitom/itom-busybox:1.28.3-0017
Image ID: docker://sha256:1b8f10897cf90ecb12b20ee01b3072bf9c89b8336398137b2736792e9b5f00ef
Port:
Host Port:
Command:
sh
-c
until nc -vz kubernetes-vault 8899 -w 10; do echo waiting for kubernetes-vault; sleep 2; done;
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 14 Mar 2019 17:00:44 +0300
Finished: Thu, 14 Mar 2019 17:00:44 +0300
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vhk7b (ro)
install:
Container ID: docker://bdaef2c9f356d543ad209a6ba976c55772e12904a0a8097d9fcd36f9415c8063
Image: localhost:5000/kubernetes-vault-init:0.5.0-0030
Image ID: docker://sha256:caa9f5c8113e6c961db3aa0781cb08c1fff954f7c4531a6a62b67780f36936d9
Port:
Host Port:
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 14 Mar 2019 17:00:46 +0300
Finished: Thu, 14 Mar 2019 17:01:04 +0300
Ready: True
Restart Count: 0
Environment:
VAULT_ROLE_ID: 863babfb-0f4d-70ef-1f7b-1c9e60ac4ec6
CERT_COMMON_NAME: Realm:RIC,Common_Name:cdf-svc.core,File_Name:token
Mounts:
/var/run/secrets/boostport.com from vault-token (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vhk7b (ro)
Containers:
cdf-apiserver:
Container ID: docker://d8e49f9dfdfdab73742220b095adde3f5c074eef48d6f70e4104fea00a5c6c2c
Image: localhost:5000/itom-cdf-apiserver:1.1.0-00401
Image ID: docker://sha256:7227a738affc2490aa58601d4438cfa3d689c533920c1126a17cb698cfe38d02
Ports: 8080/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Sun, 17 Mar 2019 11:22:49 +0300
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 17 Mar 2019 11:16:19 +0300
Finished: Sun, 17 Mar 2019 11:17:38 +0300
Ready: True
Restart Count: 614
Limits:
cpu: 2
memory: 2Gi
Requests:
cpu: 100m
memory: 1Gi
Liveness: http-get http://:8080/urest/v1.1/healthz delay=900s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8080/urest/v1.1/healthz delay=0s timeout=5s period=5s #success=1 #failure=3
Environment Variables from:
images-configmap ConfigMap Optional: false
base-configmap ConfigMap Optional: false
addnode-configmap ConfigMap Optional: false
Environment:
API_SERVER_HOST: kubernetes.default
API_SERVER_PORT: 443
SYSTEM_NAMESPACE: core
VAULT_ADDR: https://10.5.4.140:8200
ETCD_TLS_PEM_FILE: /var/run/secrets/boostport.com/keystore.p12
APPROLE: baseinfra
ROLE_ID: 863babfb-0f4d-70ef-1f7b-1c9e60ac4ec6
NAMESPACE: core
IDM_SERVER: idm-svc.core
REGISTRY_SERVER: kube-registry.core
API_REGISTRY_PORT: 5000
MASTERNODE_TIME_ZONE: Asia/Qatar
MULTI_SUITE: 0
BUILD_NUM: 00104
IDM_SVC_SERVICE_PORT: 443
HPSSO_INIT_STRING_KEY: HPSSO_INIT_STRING_KEY
VAULT_SIGNING_KEY: VAULT_SIGNING_KEY
IDM_TRANSPORT_USER_NAME: transport_admin
IDM_TRANSPORT_USER_PASSWORD_KEY: idm_transport_admin_password
IDM_ADMIN_USER_NAME: integration_admin
IDM_ADMIN_USER_PASSWORD_KEY: idm_integration_admin_password
K8S_HOME_CM: <set to the key 'K8S_HOME' of config map 'base-configmap'> Optional: false
K8S_MASTER_IP: <set to the key 'API_SERVER' of config map 'base-configmap'> Optional: false
SUITE_ADMIN_NAME: itsma_admin,opsbridge_admin,hcm_admin,dca_admin,nom_admin,demo_admin,integration_admin
RETRY_TIMES: 360
JVM_OPTION: -Xmx1024m -Xms1024m
CDF_GROUP_LIST: Administrators,SuiteAdministrators,superIDMAdmins
CDF_ADMIN_GROUP: Administrators
MY_NODE_NAME: (v1:spec.nodeName)
MY_POD_NAME: cdf-apiserver-659f8fbd4c-8b69d (v1:metadata.name)
MY_POD_NAMESPACE: core (v1:metadata.namespace)
MY_POD_IP: (v1:status.podIP)
MY_CONTAINER_NAME: cdf-apiserver
SSH_CONNECT_TIMEOUT_IN_SECONDS: 120
SSH_IO_TIMEOUT_IN_SECONDS: 120
VAULT_CA_FILE: /var/run/secrets/boostport.com/ca.crt
VAULT_CERT_FILE: /var/run/secrets/boostport.com/token.crt
VAULT_KEY_FILE: /var/run/secrets/boostport.com/token.key
Mounts:
/cdf-phase2.json from cdf-json (ro)
/install_download_zip from install-download-zip (rw)
/jdbc from jdbc-dir (rw)
/opt/kubernetes/objectdefs from yaml-dir (rw)
/pv_suite_install_tmp from suite-metadata (rw)
/var/log/cdf-deployments/core from log-dir (rw)
/var/run/secrets/boostport.com from vault-token (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vhk7b (ro)
kubernetes-vault-renew:
Container ID: docker://b2f54237f50b5d20466fd6ffb4268b242ecd082840789d416dcb32ba74bc3a37
Image: localhost:5000/kubernetes-vault-renew:0.5.0-0030
Image ID: docker://sha256:3718c526e86eafa1496908aa4eddd6e1e551c383190bb3ca8bbdf44ace812109
Port:
Host Port:
State: Running
Started: Thu, 14 Mar 2019 17:01:06 +0300
Ready: True
Restart Count: 0
Environment Variables from:
images-configmap ConfigMap Optional: false
base-configmap ConfigMap Optional: false
addnode-configmap ConfigMap Optional: false
Environment:
Mounts:
/var/run/secrets/boostport.com from vault-token (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vhk7b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
suite-metadata:
Type: HostPath (bare host directory volume)
Path: /opt/kubernetes/cfg/suite-metadata
HostPathType:
yaml-dir:
Type: HostPath (bare host directory volume)
Path: /opt/kubernetes/objectdefs/yaml_template
HostPathType:
cdf-json:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: cdf-conf-file
Optional: false
jdbc-dir:
Type: HostPath (bare host directory volume)
Path: /opt/kubernetes/tools/drivers/jdbc
HostPathType:
install-download-zip:
Type: HostPath (bare host directory volume)
Path: /opt/kubernetes/tools/install_download_zip
HostPathType:
log-dir:
Type: HostPath (bare host directory volume)
Path: /var/log/cdf-deployments/core
HostPathType:
vault-token:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-vhk7b:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-vhk7b
Optional: false
QoS Class: Burstable
Node-Selectors: master=true
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message


Warning Unhealthy 33m (x6332 over 2d) kubelet, FQDN Readiness probe failed: Get http://172.16.41.15:8080/urest/v1.1/healthz: dial tcp 172.16.41.15:8080: connect: connection refused
Warning BackOff 2m (x14390 over 2d) kubelet, FQDN Back-off restarting failed container

Thanks & Regards,
Mayur


#4

It sorta depends on how you generated the certs. They touch on this a bit in the kubernetes-vault docs.

I’d suggest going through their quick-start guide to get an idea of the flow and how it all ties together.
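In the meantime, you can check whether any given cert actually covers an address with openssl x509 -checkip (or -checkhost for DNS names). A self-contained demonstration using a throwaway cert whose SAN list mirrors the two IPs from your error (OpenSSL 1.1.1+ assumed for -addext):

```shell
# Build a throwaway cert listing only the two IPs from the error message.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout demo.key -out demo.crt \
  -subj "/CN=kubernetes-vault" \
  -addext "subjectAltName=IP:172.17.17.1,IP:10.5.4.140"

# -checkip reports whether the cert is valid for a given address:
openssl x509 -in demo.crt -noout -checkip 10.5.4.140   # matches
openssl x509 -in demo.crt -noout -checkip 127.0.0.1    # does not match
```

Run the same -checkip 127.0.0.1 against your actual Vault cert: if it reports a mismatch, that confirms the cert needs re-issuing before vault status will work over loopback.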


#5

I’ll go through it in detail. Can you please check the logs provided from the failed pod in the meantime?

Thanks!