gRPC health checks, interested?

For those deploying gRPC servers to Kubernetes, I am looking for community feedback about having native support for gRPC health checks in Kubernetes. (This is currently being discussed in issue #21493.)

Questions I have for you:

  • Do you use Kubernetes readiness/liveness probes on your gRPC server?
  • If so, what kind of probe do you use? (There are plenty of methods to choose from, such as shipping a client in the container image and calling it from an exec probe.)
  • Are you aware of the gRPC Health Check Protocol? If not, would you consider refactoring your application to implement its Check() RPC to get Kubernetes-native health checking for gRPC servers?
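
For anyone who has not seen the protocol, here is a rough sketch of what Check() is expected to do, written in plain Python with no gRPC library involved. A server keeps a registry of service names and their serving status, the empty string stands for the server's overall health, and an unregistered name is answered with a NOT_FOUND status. The class names, the exception, and the service name "myapp.Orders" are made up for illustration; a real server would use the health service shipped with its gRPC library instead.

```python
from enum import Enum


class ServingStatus(Enum):
    # Mirrors the status values Check() can report in the health protocol.
    UNKNOWN = 0
    SERVING = 1
    NOT_SERVING = 2


class ServiceNotFound(Exception):
    """Stands in for a NOT_FOUND gRPC status (hypothetical name)."""


class HealthRegistry:
    """In-memory model of the registry behind Check() (hypothetical class)."""

    def __init__(self):
        # The empty string is the key for the server's overall health.
        self._statuses = {"": ServingStatus.SERVING}

    def set_status(self, service, status):
        self._statuses[service] = status

    def check(self, service=""):
        # Registered name: return its status. Unregistered name: NOT_FOUND.
        try:
            return self._statuses[service]
        except KeyError:
            raise ServiceNotFound(service) from None


registry = HealthRegistry()
# "myapp.Orders" is a hypothetical service name.
registry.set_status("myapp.Orders", ServingStatus.NOT_SERVING)
print(registry.check())                # overall server health
print(registry.check("myapp.Orders"))  # one specific service
```

A probe only has to call Check() and treat anything other than SERVING as a failure.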
3 Likes

We have probe endpoints defined in the app. Liveness just returns 200 when everything is up and running; Readiness also checks the connection to the database.

I wasn’t aware of the gRPC Health Check Protocol; I’ll check it out.

1 Like

It doesn’t sound like you’re using gRPC? Everything returns 200 in gRPC even when it is failing, and Kubernetes cannot make gRPC calls.

We are using gRPC with grpc-gateway, so there are endpoints for /liveness and /readiness; when the container (app server) is running, status code 200 is returned. The grpc-gateway lives in a different container in the same pod.
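
To make the setup concrete, here is a minimal sketch of HTTP probe endpoints of the kind described in this thread, using only the Python standard library. The /liveness and /readiness paths come from the post above, and the database check on readiness follows the earlier description. The function names, the placeholder database check, and port 8080 are assumptions for illustration, not the poster's actual code.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def database_is_reachable():
    # Hypothetical placeholder: swap in a real check, e.g. a cheap query.
    return True


def probe_status(path, db_check=database_is_reachable):
    """Return the HTTP status code for a probe path."""
    if path == "/liveness":
        return 200  # the process is up and able to answer
    if path == "/readiness":
        # Ready only when the database answers; 503 makes the probe fail.
        return 200 if db_check() else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path))
        self.end_headers()


def serve(port=8080):
    # Port 8080 is an arbitrary choice for this sketch.
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Calling serve() starts the endpoints, and a Kubernetes httpGet probe can then be pointed at them. Kubernetes counts any status from 200 up to (but not including) 400 as success, so the 503 marks the container as not ready.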

1 Like

Update: I’ve been working on a solution for this and have released https://github.com/grpc-ecosystem/grpc-health-probe/. It’s specifically designed with Kubernetes in mind.

grpc_health_probe is a command-line tool that pings gRPC servers, as long as they implement the standard health-checking protocol. You can use it with Kubernetes liveness/readiness “exec” probes.
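
As a rough illustration of wiring this into a pod spec, the sketch below builds the readinessProbe and livenessProbe stanzas as a Python dict and prints them as JSON, which can be pasted into the container section of a JSON manifest. The binary path /bin/grpc_health_probe, the address :5000, and the delay values are assumptions for this example; the binary has to be present in the container image at whatever path you choose.

```python
import json


def exec_probe(addr, binary="/bin/grpc_health_probe", initial_delay=5):
    """Build an exec probe stanza that runs grpc_health_probe against addr."""
    return {
        # The probe passes when the command exits with status 0.
        "exec": {"command": [binary, "-addr=" + addr]},
        "initialDelaySeconds": initial_delay,
    }


container_probes = {
    "readinessProbe": exec_probe(":5000"),
    "livenessProbe": exec_probe(":5000", initial_delay=10),
}
print(json.dumps(container_probes, indent=2))
```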

I’ll be writing a blog post about this on the gRPC or Kubernetes blog soon.

2 Likes

I’m able to exec the grpc-health-probe binary, but I’m facing some problems with TLS. Here is what I did:

  1. Created a custom keystore and started the gRPC server on a secured port.
  2. Hit the server with the command grpc-health-probe -addr=host:8443 -tls -tls-no-verify or /grpc-health-probe -addr=host:8443 -tls -tls-ca-cert
  3. Got the error below:
    timeout: failed to connect service "host:8443" within 1s

Has anybody tried this and succeeded?