gRPC health checks, interested?

#1

For those deploying gRPC servers to Kubernetes, I am looking for community feedback about having native health checks for gRPC. (This is currently being discussed in issue #21493.)

Questions I have for you:

  • Do you use Kubernetes readiness/liveness probes on your gRPC server?
  • If so, what kind of probe do you use? (There are plenty of methods to choose from, like shipping a client in the container image and using the exec probe.)
  • Are you aware of the gRPC Health Check Protocol? If not, would you consider refactoring your application to implement its Check() RPC to get Kubernetes-native health checking for gRPC servers?

#2

We have probe endpoints defined in the app. Liveness just returns 200 when everything is up and running; readiness also checks the connection to the database.
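A minimal Go sketch of this kind of setup, using only the standard library. The paths `/liveness` and `/readiness`, the `pinger` interface, and the `fakeDB` type are hypothetical names chosen for illustration; the poster's actual code is not shown in the thread. In a real service, a `*sql.DB` would satisfy `pinger` through its `PingContext` method.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// pinger is a hypothetical stand-in for the database handle.
type pinger interface {
	PingContext(ctx context.Context) error
}

// fakeDB lets the sketch run without a real database.
type fakeDB struct{ err error }

func (f fakeDB) PingContext(ctx context.Context) error { return f.err }

// errDown simulates a lost database connection.
var errDown = errors.New("connection refused")

// newProbeMux registers the two probe endpoints (assumed paths).
func newProbeMux(db pinger) *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: the process is up and able to answer HTTP requests.
	mux.HandleFunc("/liveness", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: additionally require a working database connection.
	mux.HandleFunc("/readiness", func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := db.PingContext(ctx); err != nil {
			http.Error(w, "database unavailable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues an in-memory GET and returns the HTTP status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	healthy := newProbeMux(fakeDB{})
	broken := newProbeMux(fakeDB{err: errDown})
	fmt.Println(probe(healthy, "/readiness")) // 200
	fmt.Println(probe(broken, "/liveness"))   // 200
	fmt.Println(probe(broken, "/readiness"))  // 503
}
```

Keeping liveness free of dependency checks means a database outage takes the pod out of rotation without restarting it.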

I wasn’t aware of the gRPC Health Check Protocol; I will check it out.


#3

It doesn’t sound like you’re using gRPC? A gRPC response carries HTTP status 200 even when the call fails, because the actual result is reported in the grpc-status field. And Kubernetes cannot make gRPC calls.
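To illustrate why an HTTP-level probe cannot tell a failed gRPC call from a successful one, here is a rough Go sketch. It is not real gRPC: `fakeGRPCFailure` is a hypothetical plain-HTTP handler that only imitates the shape of a failed response (status 200, with the outcome in a trailing `grpc-status` field), and it runs over HTTP/1.1 rather than the HTTP/2 that gRPC actually uses.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fakeGRPCFailure imitates a failed gRPC response: HTTP 200 on the wire,
// with the real outcome carried in a trailing grpc-status field.
func fakeGRPCFailure(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Trailer", "Grpc-Status") // announce the trailer up front
	w.Header().Set("Content-Type", "application/grpc")
	w.WriteHeader(http.StatusOK)
	io.WriteString(w, "x")              // placeholder body, not a real gRPC frame
	w.Header().Set("Grpc-Status", "14") // 14 is the UNAVAILABLE status code
}

// observe returns what an HTTP client sees: the status code and the trailer.
func observe() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(fakeGRPCFailure))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	// Trailers only become available once the body has been read to the end.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, "", err
	}
	return resp.StatusCode, resp.Trailer.Get("Grpc-Status"), nil
}

func main() {
	code, status, err := observe()
	fmt.Println(code, status, err)
}
```

A probe that only inspects the HTTP status code would treat this response as healthy, even though the trailing status reports a failure.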


#4

We are using gRPC with grpc-gateway, so there are /liveness and /readiness endpoints; when the container (app server) is running, they return status code 200. The grpc-gateway lives in a different container in the same pod.


#5

Update: I’ve been working on a solution for this and have released https://github.com/grpc-ecosystem/grpc-health-probe/. It’s specifically designed with Kubernetes in mind.

grpc_health_probe gives you a command-line tool to ping gRPC servers, as long as they implement the standard health-checking protocol. You can use this with Kubernetes liveness/readiness “exec” probes.
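As a rough sketch of what such an “exec” probe might look like in a pod spec: the container name, image, port 5000, binary path /bin/grpc_health_probe, and delay values below are all assumptions for illustration, and the binary has to be bundled into the container image for this to work.

```yaml
# Pod spec fragment (not a complete manifest).
containers:
- name: server                        # hypothetical container name
  image: example.com/my-grpc-server   # hypothetical image that bundles the probe binary
  ports:
  - containerPort: 5000               # assumed gRPC port
  readinessProbe:
    exec:
      command: ["/bin/grpc_health_probe", "-addr=:5000"]
    initialDelaySeconds: 5
  livenessProbe:
    exec:
      command: ["/bin/grpc_health_probe", "-addr=:5000"]
    initialDelaySeconds: 10
```

The kubelet treats exit code 0 from the command as success and any other exit code as a failed probe.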

I’ll be writing a blog post about this on the gRPC or Kubernetes blog soon.


#6

I’m able to exec the grpc-health-probe binary, but I’m facing some problems with TLS. Here is what I did:

  1. Created a custom keystore and started the gRPC server on a secured port.
  2. Ran the probe against the server with grpc-health-probe -addr=host:8443 -tls -tls-no-verify or /grpc-health-probe -addr=host:8443 -tls -tls-ca-cert
  3. Got the error below:
    timeout: failed to connect service "host:8443" within 1s

Anybody tried this and succeeded?
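One way to narrow down an error like this is to check separately whether the TCP connection and the TLS handshake succeed. The Go sketch below is a hypothetical diagnostic, not part of grpc-health-probe; `diagnose` and `demo` are made-up names, and `demo` runs against throwaway local test servers rather than the real host:8443. Certificate verification is skipped, mirroring -tls-no-verify. The "within 1s" in the error appears to correspond to the probe's connection timeout, which the tool's README describes as adjustable with -connect-timeout.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// diagnose reports the stage reached when connecting to addr:
// "tcp" if the TCP connection fails, "tls" if the handshake fails, else "ok".
func diagnose(addr string, timeout time.Duration) (string, error) {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return "tcp", err
	}
	defer conn.Close()

	// Bound the handshake with the same timeout.
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return "tcp", err
	}
	// Skip certificate verification, as -tls-no-verify does.
	tlsConn := tls.Client(conn, &tls.Config{InsecureSkipVerify: true})
	if err := tlsConn.Handshake(); err != nil {
		return "tls", err
	}
	return "ok", nil
}

// demo runs diagnose against a local test server, with or without TLS.
func demo(useTLS bool) (string, error) {
	var srv *httptest.Server
	if useTLS {
		srv = httptest.NewTLSServer(http.NotFoundHandler())
	} else {
		srv = httptest.NewServer(http.NotFoundHandler())
	}
	defer srv.Close()
	return diagnose(srv.Listener.Addr().String(), time.Second)
}

func main() {
	for _, useTLS := range []bool{true, false} {
		stage, err := demo(useTLS)
		fmt.Println("server TLS:", useTLS, "stage:", stage, "err:", err)
	}
}
```

If the stage is "tcp", the port is probably unreachable from inside the container (wrong host, port, or network policy). If it is "tls", the server may not be speaking TLS on that port, or it may require a client certificate.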