Who/where actually runs the liveness probe in Kubernetes?


#1

In my Kubernetes cluster, the HTTP liveness probe always fails with this message:

Liveness probe failed: Get http://10.233.90.72:8080/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

As a result, the coredns and kubernetes-dashboard pods (and any other pod using an HTTP liveness probe) restart endlessly.

While a pod is running (between the start and restart events), I check its endpoint by running curl http://10.233.90.72:8080/health from a busyboxplus pod. The command works normally and I see OK returned, but the liveness probe still fails and the pod keeps restarting…
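One difference between a manual check like this and the kubelet's probe is the deadline: the kubelet aborts an HTTP probe after the probe's timeoutSeconds (1 s unless overridden), while a plain curl waits far longer. A hedged sketch of reproducing the probe with that deadline (10.233.90.72:8080/health is the failing endpoint from the error message above):

```shell
# Mimic the kubelet's HTTP probe deadline with curl's --max-time.
# A plain curl can succeed where the probe fails simply because it waits longer.
curl --max-time 1 http://10.233.90.72:8080/health && echo "probe ok" || echo "probe failed within 1s"
```

If this times out from the node that hosts the pod but a plain curl from another pod succeeds, the problem is more likely node-to-pod networking than the application itself.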

In this situation I want to debug the liveness probe, but I have no idea who/where actually runs the liveness probe in Kubernetes. Is it the pod? Or the node?

How can I debug the liveness probe? Has anyone had the same issue?

Please advise.

kubectl versions:
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

version info:
 OS: Ubuntu 18.04
 Kubernetes: 1.13.3
 Docker: 18.09.2

I also asked this on Stack Overflow: https://stackoverflow.com/questions/54702668/who-where-actually-work-liveness-probe-in-kubernetes

Thanks in advance.


#2

Can you access the pod IPs from the nodes themselves? Kubelet doesn’t (generally) run in a Pod.


#3

I think the docs say the kubelet (i.e., the node) runs those probes.
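For reference, the probe is defined in the pod spec, and the kubelet on the node hosting the pod performs the GET and restarts the container on repeated failure. A sketch of an HTTP liveness probe with illustrative values (check the actual coredns Deployment for its real settings):

```yaml
# Executed by the kubelet on the pod's node, not inside the pod.
livenessProbe:
  httpGet:
    path: /health        # the endpoint being curled in the question
    port: 8080
  initialDelaySeconds: 60
  timeoutSeconds: 5      # the "Client.Timeout exceeded" error fires at this deadline
  failureThreshold: 5    # restart after this many consecutive failures
```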

The root problem may vary: the pods might have too little CPU, the node might be overloaded, etc. Do you have metrics to help understand what is happening in the cluster?


#4

Yes, from the node I can access the pod IP and the health-check URL:

curl http://10.233.123.147:8080/health
OK

#5

How can I see metrics…?
I have 3 machines (actually VMs) in the Kubernetes cluster. Each has 4 CPU cores, 4 GB of memory, and a 50 GB SSD.


#6

omg…!!
I have 3 nodes (node1, node2, node3) and 2 coredns pods, which run on node1 and node2.
When I test from node1, I can access the coredns pod on node2, but I cannot access the coredns pod on node1!!

For example:

on node1, coredns1 - 1.1.1.1
on node2, coredns2 - 2.2.2.2

From node1:
  access 1.1.1.1:8080/health -> timeout
  access 2.2.2.2:8080/health -> ok

From node2:
  access 1.1.1.1:8080/health -> ok
  access 2.2.2.2:8080/health -> timeout

A real example (traceroute from node1):

10.233.90.73 -> pod on node1
10.233.66.14 -> pod on node2

root@node1:  traceroute 10.233.66.14
traceroute to 10.233.66.14 (10.233.66.14), 64 hops max
  1   10.233.66.0  2.670ms  3.469ms  3.941ms
  2   10.233.66.14  1.403ms  0.345ms  0.236ms

root@node1: traceroute 10.233.90.73
traceroute to 10.233.90.73 (10.233.90.73), 64 hops max
  1   *  *  *
  2   *  * ^C

I use Calico as the CNI.

How can I fix it?


#7

Check with the Calico folks. I suspected this was your failure mode.
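Before filing with them, a few hedged first checks for this symptom (assumes kubectl access and, for the second check, calicoctl installed on a node; each step is skipped if the tool is not present):

```shell
# First-pass Calico checks when a node cannot reach its own local pods.
if command -v kubectl >/dev/null; then
  kubectl -n kube-system get pods -l k8s-app=calico-node -o wide  # all Running/Ready on every node?
fi
if command -v calicoctl >/dev/null; then
  calicoctl node status   # are BGP sessions to the other nodes Established?
fi
# Calico programs a per-pod route via a cali* interface for pods on this host:
ip route 2>/dev/null | grep cali || echo "no cali routes visible on this host"
```

Given that each node fails only against its own local pods, the local cali* routes and the calico-node pod on that host are the usual place to start looking.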


#8

OK, thanks!