Can't bring Kubernetes cluster back to life

Hi!
I inherited a Kubernetes cluster and now need to get it running again somehow. I don't know how it was originally created, but I assume it was set up with kubeadm. I noticed that the certificates had expired, so I renewed them with:

kubeadm alpha certs renew all
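
For reference, the expiry dates can be checked with the command below (on this 1.15 setup the certs subcommands still live under "alpha", so the exact invocation may differ on other versions):

kubeadm alpha certs check-expiration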

Now the etcd and kube-apiserver containers are not working (they start and then stop).

docker ps -a
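
(I'm abbreviating the container names below for readability; the actual kubelet-managed entries can be picked out of the container list with a simple filter, e.g.:)

docker ps -a | grep -E 'kube-apiserver|etcd'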

docker logs api-server:

Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I1119 15:50:49.995997       1 server.go:560] external host was not specified, using 192.168.1.3
I1119 15:50:49.996210       1 server.go:147] Version: v1.15.2
I1119 15:50:50.410114       1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I1119 15:50:50.410138       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
E1119 15:50:50.410654       1 prometheus.go:55] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410689       1 prometheus.go:68] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410709       1 prometheus.go:82] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410727       1 prometheus.go:96] failed to register workDuration metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410753       1 prometheus.go:112] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410778       1 prometheus.go:126] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410804       1 prometheus.go:152] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410825       1 prometheus.go:164] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410866       1 prometheus.go:176] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410900       1 prometheus.go:188] failed to register work_duration metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410924       1 prometheus.go:203] failed to register unfinished_work_seconds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410934       1 prometheus.go:216] failed to register longest_running_processor_microseconds metric admission_quota_controller: duplicate metrics collector registration attempted
I1119 15:50:50.410952       1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I1119 15:50:50.410962       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
I1119 15:50:50.412410       1 client.go:354] parsed scheme: ""
I1119 15:50:50.412424       1 client.go:354] scheme "" not registered, fallback to default scheme
I1119 15:50:50.412471       1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0  <nil>}]
I1119 15:50:50.412508       1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W1119 15:50:50.412748       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
I1119 15:50:51.407920       1 client.go:354] parsed scheme: ""
I1119 15:50:51.407942       1 client.go:354] scheme "" not registered, fallback to default scheme
I1119 15:50:51.407974       1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0  <nil>}]
I1119 15:50:51.408003       1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W1119 15:50:51.408241       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:51.412833       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:52.408397       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:52.987076       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:54.057294       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:55.505474       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:56.454869       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:00.038449       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:00.372682       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:06.682062       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:07.463691       1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
I1119 15:51:10.412593       1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: []
F1119 15:51:10.412600       1 storage_decorator.go:57] Unable to create storage backend: config (&{ /registry {[https://127.0.0.1:2379] /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/etcd/ca.crt} true 0xc0005b90e0 apiextensions.k8s.io/v1beta1 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:2379: connect: connection refused)
W1119 15:51:10.412741       1 asm_amd64.s:1337] Failed to dial 127.0.0.1:2379: context canceled; please retry.

docker logs etcd:

2020-11-19 15:10:01.351661 I | etcdmain: etcd Version: 3.3.10
2020-11-19 15:10:01.351731 I | etcdmain: Git SHA: 27fc7e2
2020-11-19 15:10:01.351735 I | etcdmain: Go Version: go1.10.4
2020-11-19 15:10:01.351738 I | etcdmain: Go OS/Arch: linux/amd64
2020-11-19 15:10:01.351742 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2020-11-19 15:10:01.351795 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2020-11-19 15:10:01.351815 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-11-19 15:10:01.352371 I | embed: listening for peers on https://192.168.1.3:2380
2020-11-19 15:10:01.352404 I | embed: listening for client requests on 127.0.0.1:2379
2020-11-19 15:10:01.352426 I | embed: listening for client requests on 192.168.1.3:2379
2020-11-19 15:10:01.360515 W | snap: skipped unexpected non snapshot file tmp527778953
2020-11-19 15:10:01.362871 I | etcdserver: recovered store from snapshot at index 104718355
2020-11-19 15:10:01.364214 I | mvcc: restore compact to 87110378
2020-11-19 15:10:01.376240 I | etcdserver: name = k8s-master-01
2020-11-19 15:10:01.376263 I | etcdserver: data dir = /var/lib/etcd
2020-11-19 15:10:01.376269 I | etcdserver: member dir = /var/lib/etcd/member
2020-11-19 15:10:01.376272 I | etcdserver: heartbeat = 100ms
2020-11-19 15:10:01.376274 I | etcdserver: election = 1000ms
2020-11-19 15:10:01.376277 I | etcdserver: snapshot count = 10000
2020-11-19 15:10:01.376286 I | etcdserver: advertise client URLs = https://192.168.1.3:2379
2020-11-19 15:10:01.448684 I | etcdserver: restarting member 361c924cbd55a81 in cluster 7e3c896b15fbe02d at commit index 104724993
2020-11-19 15:10:01.448967 I | raft: 361c924cbd55a81 became follower at term 74097
2020-11-19 15:10:01.449001 I | raft: newRaft 361c924cbd55a81 [peers: [361c924cbd55a81,dad85d000dfebf92,e6cf4fe3e32b8396], term: 74097, commit: 104724993, applied: 104718355, lastindex: 104724995, lastterm: 45466]
2020-11-19 15:10:01.449122 I | etcdserver/api: enabled capabilities for version 3.3
2020-11-19 15:10:01.449139 I | etcdserver/membership: added member 361c924cbd55a81 [https://192.168.1.3:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449143 I | etcdserver/membership: added member dad85d000dfebf92 [https://192.168.1.4:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449146 I | etcdserver/membership: added member e6cf4fe3e32b8396 [https://192.168.1.5:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449150 I | etcdserver/membership: set the cluster version to 3.3 from store
2020-11-19 15:10:01.450999 I | mvcc: restore compact to 87110378
2020-11-19 15:10:01.462560 W | auth: simple token is not cryptographically signed
2020-11-19 15:10:01.465418 I | rafthttp: starting peer dad85d000dfebf92...
2020-11-19 15:10:01.465478 I | rafthttp: started HTTP pipelining with peer dad85d000dfebf92
2020-11-19 15:10:01.465710 I | rafthttp: started streaming with peer dad85d000dfebf92 (writer)
2020-11-19 15:10:01.465804 I | rafthttp: started streaming with peer dad85d000dfebf92 (writer)
2020-11-19 15:10:01.465993 I | rafthttp: started peer dad85d000dfebf92
2020-11-19 15:10:01.466018 I | rafthttp: started streaming with peer dad85d000dfebf92 (stream Message reader)
2020-11-19 15:10:01.466035 I | rafthttp: added peer dad85d000dfebf92
2020-11-19 15:10:01.466050 I | rafthttp: starting peer e6cf4fe3e32b8396...
2020-11-19 15:10:01.466058 I | rafthttp: started HTTP pipelining with peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466067 I | rafthttp: started streaming with peer dad85d000dfebf92 (stream MsgApp v2 reader)
2020-11-19 15:10:01.466308 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (writer)
2020-11-19 15:10:01.466483 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (writer)
2020-11-19 15:10:01.466637 I | rafthttp: started peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466650 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (stream Message reader)
2020-11-19 15:10:01.466656 I | rafthttp: added peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466669 I | etcdserver: starting server... [version: 3.3.10, cluster version: 3.3]
2020-11-19 15:10:01.466892 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (stream MsgApp v2 reader)
2020-11-19 15:10:01.469431 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-11-19 15:10:02.949569 I | raft: 361c924cbd55a81 is starting a new election at term 74097
2020-11-19 15:10:02.949607 I | raft: 361c924cbd55a81 became candidate at term 74098
2020-11-19 15:10:02.949631 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74098
2020-11-19 15:10:02.949640 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74098
2020-11-19 15:10:02.949647 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74098
2020-11-19 15:10:03.949545 I | raft: 361c924cbd55a81 is starting a new election at term 74098
2020-11-19 15:10:03.949590 I | raft: 361c924cbd55a81 became candidate at term 74099
2020-11-19 15:10:03.949622 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74099
2020-11-19 15:10:03.949631 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74099
2020-11-19 15:10:03.949641 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74099
2020-11-19 15:10:05.749518 I | raft: 361c924cbd55a81 is starting a new election at term 74099
2020-11-19 15:10:05.749560 I | raft: 361c924cbd55a81 became candidate at term 74100
2020-11-19 15:10:05.749570 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74100
2020-11-19 15:10:05.749579 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74100
2020-11-19 15:10:05.749585 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74100
2020-11-19 15:10:06.466249 W | rafthttp: health check for peer dad85d000dfebf92 could not connect: dial tcp 192.168.1.4:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2020-11-19 15:10:06.466315 W | rafthttp: health check for peer dad85d000dfebf92 could not connect: dial tcp 192.168.1.4:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2020-11-19 15:10:06.467084 W | rafthttp: health check for peer e6cf4fe3e32b8396 could not connect: dial tcp 192.168.1.5:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2020-11-19 15:10:06.467113 W | rafthttp: health check for peer e6cf4fe3e32b8396 could not connect: dial tcp 192.168.1.5:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
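
From these logs it looks like this etcd member cannot reach its two peers (192.168.1.4:2380 and 192.168.1.5:2380), so it never gets quorum, which would explain why the API server's connections to 127.0.0.1:2379 keep failing. Once etcd is reachable again I was planning to probe it with something like this (etcdctl v3; the client certificate paths are the ones shown in the kube-apiserver log above, so treat the exact flags as an assumption on my part):

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/apiserver-etcd-client.crt \
  --key=/etc/kubernetes/pki/apiserver-etcd-client.key \
  endpoint health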

systemctl status kubelet (the kubelet is running, but it keeps logging errors):

Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.461018    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.561183    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.653030    1074 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.1.3:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-master-01&limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.661422    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.761566    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.853243    1074 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://192.168.1.3:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.861820    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.962034    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:40 k8s-master-01 kubelet[1074]: E1119 18:49:40.053584    1074 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.1.3:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
Nov 19 18:49:40 k8s-master-01 kubelet[1074]: E1119 18:49:40.062258    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:40 k8s-master-01 kubelet[1074]: E1119 18:49:40.162491    1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:40 k8s-master-01 kubelet[1074]: E1119 18:49:40.253689    1074 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://192.168.1.3:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-master-01&limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused

Could someone advise what further steps should be taken to bring at least this one node back online?
Or would it be easier to create a new Kubernetes cluster and move the applications (8 images) there?

Also, where can the Pod configuration be found? I haven't noticed any *.yaml files describing what to do with all these images.
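
In case it helps to answer that: my current assumption is that, once the API server is reachable again, I could export whatever workload definitions are still stored in the cluster with something like the command below and use that as a starting point for a rebuild (just a sketch, I'm not sure it's the right approach):

kubectl get deployments,daemonsets,statefulsets,services,configmaps --all-namespaces -o yaml > cluster-workloads.yaml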

Thanks in advance.
