Hi, please forgive my bad English.
I'm trying to spawn an etcd cluster with 3 pods. I have a Kubernetes cluster with 5 servers (3 managers and 5 nodes).
I have created 3 pods:
```
NAME    READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
etcd0   1/1     Running   0          20m   10.233.74.21     kube-node2     <none>           <none>
etcd1   1/1     Running   0          20m   10.233.73.84     kube-node1     <none>           <none>
etcd2   1/1     Running   0          20m   10.233.105.143   kube-master2   <none>           <none>
```
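Each pod is defined more or less like this (a simplified sketch: the image version and the exact etcd flags are approximate, what matters here are the names, the label and the peer URLs):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd0
  labels:
    etcd_node: etcd0                    # matched by the etcd0 service selector below
spec:
  containers:
  - name: etcd0
    image: quay.io/coreos/etcd:v3.3     # approximate, any recent etcd v3 image
    command:
    - /usr/local/bin/etcd
    - --name=etcd0
    - --initial-advertise-peer-urls=http://etcd0:2380
    - --listen-peer-urls=http://0.0.0.0:2380
    - --advertise-client-urls=http://etcd0:2379
    - --listen-client-urls=http://0.0.0.0:2379
    - --initial-cluster=etcd0=http://etcd0:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380
    - --initial-cluster-state=new
    ports:
    - name: client
      containerPort: 2379
    - name: peer
      containerPort: 2380
```

etcd1 and etcd2 are the same, with the names swapped.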
I also created 3 services to get valid DNS resolution:
```
NAME    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE   SELECTOR
etcd0   ClusterIP   10.233.32.88   <none>        2379/TCP,2380/TCP   21m   etcd_node=etcd0
etcd1   ClusterIP   10.233.8.202   <none>        2379/TCP,2380/TCP   21m   etcd_node=etcd1
etcd2   ClusterIP   10.233.60.23   <none>        2379/TCP,2380/TCP   21m   etcd_node=etcd2
```
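Each service looks like this (reconstructed from the kubectl output above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: etcd0
spec:
  type: ClusterIP
  selector:
    etcd_node: etcd0        # selects the etcd0 pod
  ports:
  - name: client
    port: 2379
    targetPort: 2379
  - name: peer
    port: 2380
    targetPort: 2380
```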
From etcd0 and etcd2 there is no problem: they can reach all the other members. But etcd1 cannot reach either etcd0 or etcd2:
```
2019-01-16 13:18:23.898150 W | etcdserver: failed to reach the peerURL(http://etcd0:2380) of member cf1d15c5d194b5c9 (Get http://etcd0:2380/version: dial tcp 10.233.32.88:2380: i/o timeout)
2019-01-16 13:18:23.898176 W | etcdserver: cannot get the version of member cf1d15c5d194b5c9 (Get http://etcd0:2380/version: dial tcp 10.233.32.88:2380: i/o timeout)
2019-01-16 13:18:25.898418 W | etcdserver: failed to reach the peerURL(http://etcd2:2380) of member d282ac2ce600c1ce (Get http://etcd2:2380/version: dial tcp 10.233.60.23:2380: i/o timeout)
2019-01-16 13:18:25.898444 W | etcdserver: cannot get the version of member d282ac2ce600c1ce (Get http://etcd2:2380/version: dial tcp 10.233.60.23:2380: i/o timeout)
2019-01-16 13:18:28.497224 W | rafthttp: health check for peer cf1d15c5d194b5c9 could not connect: dial tcp 10.233.32.88:2380: i/o timeout
2019-01-16 13:18:28.501596 W | rafthttp: health check for peer d282ac2ce600c1ce could not connect: dial tcp 10.233.60.23:2380: i/o timeout
```
etcd1 is currently on kube-node1, but if I delete all the pods and recreate them, it is another pod on another node that has the problem. If I use the pod IPs from etcd1 instead of the service names, it works, but that is not a good solution.
[EDIT] I have tested some combinations of nodes and it seems that pods on kube-node1 and kube-master1 encounter the connectivity problem. I don't know why…
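To check whether this is a node-level (CNI) problem rather than an etcd problem, a plain test pod pinned to kube-node1 can try to reach the etcd0 service directly (a sketch; the busybox image and the pod name are only for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nettest
spec:
  nodeName: kube-node1        # pin the pod to the node that shows the problem
  restartPolicy: Never
  containers:
  - name: nettest
    image: busybox:1.30       # any image with wget is enough
    command:
    - sh
    - -c
    - "wget -qO- -T 5 http://etcd0:2380/version || echo 'cannot reach etcd0 peer URL'"
```

If `kubectl logs nettest` shows the same timeout, the problem is in the pod network between the nodes (CNI / firewall) and not in etcd itself.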
k8s version: 1.13.0
Any idea?