After rebooting K8s, services fail exactly 40 seconds later

Asking for help? Comment out what you need so we can get more information to help you!

Cluster information:

Kubernetes version: 1.14.7
Cloud being used: (put bare-metal if not on a public cloud)
Installation method:
Host OS: SLES15
CNI and version:
CRI and version:

This is pretty vague and could mean anything. What do you mean when you say that services are failing?

For example, is kubelet dying? Are your pods not behaving as you expect?

Also what logs have you checked?

Hello, here is my analysis, laid out as a flow. Please let me know which logs you are looking for; there are so many logs to refer to…

That doesn’t really help me to help you. To clarify, you need to be more precise in explaining what you’re observing. Assume we know nothing at all about your problem, what you’re seeing, or what application you’re talking about, because we actually don’t.

So right now this is where we’re at diagnostically:

  • What service is failing? (Defining “service” is important, because that’s vague and could mean anything from api-server to your own application; it could also mean the Service object type)
  • What are you observing that leads you to say that it’s failing? (Are you seeing something in a log? Can you not make an HTTP or TCP request to a port?)

Also be verbose: share what troubleshooting steps you’ve done and what you observed in detail. When people see someone trying, they instinctively want to help.

If I recall, 40 seconds is how long the controller-manager waits without hearing from a node before it marks that node NotReady and starts declaring the pods on that node unready (the default for its --node-monitor-grace-period flag).
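To make that timing concrete, here is a minimal Python sketch of the grace-period check. It is a simplified model for illustration, not the controller-manager's actual code: the function name node_considered_healthy is made up, and the 40-second value is assumed to be the default grace period for this Kubernetes version.

```python
from datetime import datetime, timedelta

# Assumption: 40s is the default node monitor grace period in this cluster.
NODE_MONITOR_GRACE_PERIOD = timedelta(seconds=40)


def node_considered_healthy(last_heartbeat, now,
                            grace_period=NODE_MONITOR_GRACE_PERIOD):
    """Hypothetical helper: True while the last node heartbeat is still
    within the grace period, False once the period has been exceeded."""
    return now - last_heartbeat <= grace_period


# Example: the last status update the controller saw is from 12:00:00.
last_seen = datetime(2020, 1, 1, 12, 0, 0)
for offset in (10, 39, 41):
    now = last_seen + timedelta(seconds=offset)
    state = "Ready" if node_considered_healthy(last_seen, now) else "NotReady"
    print(f"+{offset}s: node treated as {state}")
```

If this model applies to the reboot case, pods could keep showing their stale pre-reboot status until the grace period runs out, then flip to unready, then recover once the kubelet reports in again. That is a guess from the symptoms, not something confirmed from logs.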

Steps

  1. Install K8s, Docker, and Helm
  2. Install pods/services: logging, postgres, storage, and a few more
  3. Simply reboot
  4. The services restart and show as Running; after 40 secs they suddenly start failing
  5. Within 10 secs they show as Running again, and all pods are up and running
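One way to pin down the timing in steps 4 and 5 is to record when each pod's Ready condition last changed. The sketch below is only a suggestion: the function name ready_transitions is made up, and it assumes the input is the JSON printed by "kubectl get pods --all-namespaces -o json", where each pod carries status.conditions entries with type, status, and lastTransitionTime fields.

```python
import json


def ready_transitions(pod_list_json):
    """Hypothetical helper: return (lastTransitionTime, namespace, name,
    status) for every pod's Ready condition, sorted by transition time."""
    pods = json.loads(pod_list_json).get("items", [])
    rows = []
    for pod in pods:
        meta = pod.get("metadata", {})
        for cond in pod.get("status", {}).get("conditions", []):
            if cond.get("type") == "Ready":
                rows.append((
                    cond.get("lastTransitionTime", ""),
                    meta.get("namespace", ""),
                    meta.get("name", ""),
                    cond.get("status", "Unknown"),
                ))
    # Timestamps are RFC 3339 strings in UTC, so string order is time order.
    return sorted(rows)
```

Each pod only reports its most recent Ready transition, so the output would need to be captured during the failing window and again after recovery to see both flips.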

I would like to know why this fails and then runs successfully later.

So is this expected behaviour for K8s? I'm just curious to know why it fails in between, after the reboot.