(help needed) unable to init cluster - fails on CRI Upload


#1

Cross posting:
[https://serverfault.com/questions/952258/kubernetes-crisocket-information-upload-fails-with-node-not-found](http://Serverfault link)

I am using the following kubeadmin config with external etcd setup for HA kubernetes setup following Creating Highly Available Clusters with kubeadm - Kubernetes in bare metal server with centos7.

etcd version - v3.2.26

kind: ClusterConfiguration
kubernetesVersion: v1.13.1
apiServer:
  certSANs:
  - "k8-master01.loc.prov.domain.tld"
controlPlaneEndpoint: "k8-master01.loc.prov.domain.tld:8080"
etcd:
    external:
        endpoints:
        - https://k8-master01.loc.prov.domain.tld:2379
        - https://k8-master02.loc.prov.domain.tld:2379
        - https://k8-master03.loc.prov.domain.tld:2379
        caFile: /etc/kubernetes/pki/etcd/ca.crt
        certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
        keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key

However init keeps failing at following step:

I0204 15:04:24.985393  142883 uploadconfig.go:133] [upload-config] Preserving the CRISocket information for the control-plane node
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8-master01.loc.prov.domain.tld" as an annotation
I0204 15:04:25.485719  142883 round_trippers.go:419] curl -k -v -XGET  -H "User-Agent: kubeadm/v1.13.1 (linux/amd64) kubernetes/eec55b9" -H "Accept: application/json, */*" 'https://k8-master01.loc.prov.domain.tld:8080/api/v1/nodes/k8-master01.loc.prov.domain.tld'
I0204 15:04:25.488810  142883 round_trippers.go:438] GET https://k8-master01.loc.prov.domain.tld:8080/api/v1/nodes/k8-master01.loc.prov.domain.tld 404 Not Found in 3 milliseconds

It keeps retrying and then eventually times out.

error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition

( Hostname has been fuzzed in above log ) - full logs @ https://gist.github.com/anshprat/08465e64a98327b6005abf3645551eeb

The nodes list is empty.

{
  "kind": "NodeList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/nodes/",
    "resourceVersion": "57987"
  },
  "items": []
}

Any suggestions as to how we can proceed?
Also, let me know if this is not the right place for this.


#2

This turned out to be an issue w/ haproxy load balancer to api-server cert issue.
Fixing the cert issues solved this problem.