Boundary health endpoints

Boundary provides health monitoring through the /health path using a listener with the "ops" purpose. By default, a listener with that purpose runs on port 9203. See the example configuration section for an example listener stanza in a config.hcl file.

Requirements

To enable the controller health endpoint, the Boundary instance must be started as a controller. That is, Boundary's configuration file must define a controller block and a purpose = "api" listener. The configuration file must also define a purpose = "ops" listener. Under these conditions, Boundary exposes the ops server, which hosts the controller health API.

Shutdown grace period

Optionally, when the controller health endpoint is enabled, you can configure the controller health endpoint to return 503 Service Unavailable upon receiving a shutdown signal, and to wait a configurable amount of time before starting the shutdown process.

This feature supports load balancing by reducing the risk of an outgoing Boundary instance causing disruption to incoming requests.

In this state, Boundary can still process requests as normal, but it reports as unhealthy through the controller health endpoint. In load-balanced environments, the load balancer can then remove the "unhealthy" instance from the pool of instances eligible to handle requests, which reduces the likelihood that the instance receives a request during shutdown.

This feature is disabled by default, even if the controller health endpoint is enabled. You can enable it by defining graceful_shutdown_wait_duration in the controller block of Boundary's configuration file. The value must be a string that Go's time.ParseDuration can parse, such as "10s".

API

The controller health service introduces a single read-only endpoint:

Status	Description
`200`	`GET /health` returns HTTP status `200 OK` if the controller's API gRPC server is up
`5xx`	`GET /health` returns HTTP status `5xx`, or the request times out, if the controller is unhealthy
`503`	`GET /health` returns HTTP status `503 Service Unavailable` if the controller is shutting down

All responses return empty bodies. GET /health does not support any input.
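Because the controller health response carries no body, a monitoring script has only the status code to act on. The following is a minimal sketch of how a script might map the codes in the table above to a label. The function name and the labels are illustrative and are not Boundary terms.

```shell
# classify_health_status CODE
# Maps an HTTP status code from GET /health to a label, following the
# status table above. The labels are hypothetical names for this sketch.
classify_health_status() {
  case "$1" in
    200) echo "healthy" ;;
    503) echo "shutting-down" ;;
    5[0-9][0-9]) echo "unhealthy" ;;
    *) echo "unknown" ;;
  esac
}

# Example use against a running controller (the address is an assumption):
#   code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:9203/health")
#   classify_health_status "$code"
```

The 503 case is listed before the general 5xx pattern so that a controller in its shutdown grace period is reported separately from other failures.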

Check the health endpoint using `wget`

The Boundary Docker image includes wget. You can use it to check the health endpoint. Enterprise and Community edition users can check the health of controllers and workers. HCP Boundary users can check the health of their self-managed workers.

Use the following command to check a worker's status to determine if it is healthy:

$ wget -q -O - "http://localhost:9203/health?worker_info=1" | grep -c 'READY'

-q - Suppresses wget's progress and status output.
-O - (capital letter O followed by a hyphen) - Writes the response to stdout instead of saving it to a file.

If the command prints 1 to stdout and exits with exit code 0, the worker is healthy. If the command prints 0 and exits with a nonzero exit code, the worker is unhealthy or could not be reached.
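If you want to use this check in a script or a container health check, it can help to wrap the match in a small function. The sketch below is one possible approach, not an official Boundary tool. It matches the upstream_connection_state field specifically rather than any occurrence of READY, and the function name is illustrative.

```shell
# worker_is_ready
# Reads a /health?worker_info=1 response on stdin and exits 0 only when
# upstream_connection_state is READY. The function name is hypothetical.
worker_is_ready() {
  grep -Eq '"upstream_connection_state"[[:space:]]*:[[:space:]]*"READY"'
}

# Example use (the address is taken from the command above):
#   wget -q -O - "http://localhost:9203/health?worker_info=1" | worker_is_ready || exit 1
```

If wget cannot reach the worker, the function receives empty input and exits with a nonzero code, so an unreachable worker is treated the same as an unhealthy one.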

You can also use the following command without grep -c 'READY' to print a more verbose response that includes the worker's state, active session count, and connection state:

$ wget -q -O - "http://localhost:9203/health?worker_info=1"

Example response

When you check the health endpoint using wget, Boundary returns a response with the following worker information:

$ wget -q -O - "http://localhost:9203/health?worker_info=1"
{"worker_process_info":{"state":"active", "active_session_count":0, "upstream_connection_state":"READY"}}

When you check the health endpoint using curl, Boundary returns a response with the following worker information:

$ curl "worker:9403/health?worker_info=1"
{
   "worker_process_info":{
      "state":"active",
      "active_session_count":0,
      "upstream_connection_state":"READY"
   }
}

The response contains the following fields:

state - The operational state of the worker. Possible worker states include active, shutdown, and unknown.
active_session_count - The number of active sessions on the worker.
upstream_connection_state - The connection state of the worker. This value indicates whether the worker can connect to an upstream address or connection. It can be any of the following states:
- CONNECTING - The channel is trying to make a connection and is waiting for name resolution or the connection establishment.
- READY - The channel has successfully established a connection, and any attempts to communicate have succeeded.
- TRANSIENT_FAILURE - The channel suffered a transient failure such as a timeout or socket error. A channel in this state will switch to the CONNECTING state and try to establish a connection.
- IDLE - The channel is not attempting to create a connection because there are no calls.
- SHUTDOWN - The channel is shutting down. Any new calls fail immediately. Pending calls may continue running until Boundary cancels them.
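To read individual fields from the response in a script, a JSON tool such as jq is the more robust choice where it is available. The sketch below assumes only sed and tr are present, and it handles only the flat string and number fields described above. The function name is illustrative.

```shell
# extract_field NAME
# Reads a worker health response on stdin and prints the value of the
# scalar field NAME. This is a text match, not a JSON parser, so it is
# only suitable for the simple response shape described above.
extract_field() {
  tr -d '\n' |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}]*\)"\{0,1\}.*/\1/p'
}

# Example use (the address is an assumption):
#   wget -q -O - "http://localhost:9203/health?worker_info=1" | extract_field state
```

The tr step joins a pretty-printed response into one line first, so the same function works on both the compact and the indented output.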

Example configuration

Health checks are available for a controller defined with a purpose = "ops" listener stanza. For details on what fields are allowed in this stanza, refer to the documentation about TCP Listener.

An example listener stanza:

controller {
  name = "boundary-controller"
  database {
    url = "postgresql://<username>:<password>@10.0.0.1:5432/<database_name>"
  }
}

listener "tcp" {
  purpose = "api"
  tls_disable = true
}

listener "tcp" {
  purpose = "ops"
  tls_disable = true
}

To enable a shutdown grace period, update the controller block with a defined wait duration:

controller {
  name = "boundary-controller"
  database {
    url = "env://BOUNDARY_PG_URL"
  }
  graceful_shutdown_wait_duration = "10s"
}

A complete example can be found under the Controller configuration docs.