Health Discovery Service #1310

@htuch

The v2 APIs introduce HDS - https://github.com/lyft/envoy-api/blob/master/api/hds.proto (@amb67). It works as follows (pasting in the .proto comment):

  1. Envoy starts up and if its can_healthcheck option in the static bootstrap config is enabled, sends HealthCheckRequest to the management server. It supplies its capabilities (which protocol it can health check with, what zone it resides in, etc.).
  2. In response to (1), the management server designates this Envoy as a healthchecker to health check a subset of all upstream hosts for a given cluster (for example upstream Host 1 and Host 2). It streams HealthCheckSpecifier messages with cluster-related configuration for all clusters this Envoy is designated to health check. Subsequent HealthCheckSpecifier messages will be sent on changes to:
    a. Endpoints to health check
    b. Per-cluster configuration
  3. Envoy creates a health probe based on the HealthCheck config and sends it to the endpoint (ip:port) of Host 1 and Host 2. Based on the HealthCheck configuration, Envoy waits for the probe response and inspects its content to decide whether the endpoint is healthy. If a response hasn't been received within the timeout interval, the endpoint health status is considered TIMEOUT.
  4. Envoy reports results back in an EndpointHealthResponse message. Envoy streams responses as often as the interval configured by the management server in HealthCheckSpecifier.
  5. The management server collects health statuses for all endpoints in the cluster (for all clusters) and uses this information to construct EDS DiscoveryResponse messages.
  6. Once Envoy has a list of upstream endpoints to send traffic to, it load balances traffic to them without additional health checking. It may use inline health checking (i.e. consider an endpoint UNHEALTHY if a connection to it fails) to account for the health status propagation delay between HDS and EDS.
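
To make the flow above easier to follow, here is a rough Python sketch of steps 1 through 5. It is not Envoy code and not the hds.proto schema: only the message names (HealthCheckRequest, HealthCheckSpecifier, EndpointHealthResponse) and the TIMEOUT/UNHEALTHY statuses come from the description. Every field name, the `Probe` callback, and the `run_health_checks` and `healthy_endpoints` helpers are hypothetical, and filtering to healthy endpoints is only one plausible way a management server might use the results when building EDS responses.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class HealthStatus(Enum):
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"
    TIMEOUT = "TIMEOUT"


@dataclass
class HealthCheckRequest:
    # Step 1: capabilities Envoy advertises (field names are hypothetical).
    protocols: List[str]
    zone: str


@dataclass
class HealthCheckSpecifier:
    # Step 2: endpoints to probe per cluster, plus probe settings
    # (field names are hypothetical).
    endpoints_by_cluster: Dict[str, List[str]]
    timeout_s: float
    interval_s: float


@dataclass
class EndpointHealthResponse:
    # Step 4: one status per probed endpoint ("ip:port" string).
    statuses: Dict[str, HealthStatus] = field(default_factory=dict)


# Hypothetical probe callback: returns True/False when a response arrived
# (good/bad content), or None when nothing arrived within the timeout.
Probe = Callable[[str, float], Optional[bool]]


def run_health_checks(spec: HealthCheckSpecifier, probe: Probe) -> EndpointHealthResponse:
    """Step 3: probe every assigned endpoint and classify the outcome."""
    response = EndpointHealthResponse()
    for endpoints in spec.endpoints_by_cluster.values():
        for endpoint in endpoints:
            result = probe(endpoint, spec.timeout_s)
            if result is None:
                status = HealthStatus.TIMEOUT
            elif result:
                status = HealthStatus.HEALTHY
            else:
                status = HealthStatus.UNHEALTHY
            response.statuses[endpoint] = status
    return response


def healthy_endpoints(
    spec: HealthCheckSpecifier, response: EndpointHealthResponse
) -> Dict[str, List[str]]:
    """Step 5 (assumed policy): keep only HEALTHY endpoints per cluster."""
    return {
        cluster: [e for e in eps if response.statuses.get(e) is HealthStatus.HEALTHY]
        for cluster, eps in spec.endpoints_by_cluster.items()
    }
```

In a real deployment the probe would perform an HTTP, TCP, or gRPC check against the endpoint, and the report would be sent on the stream every `interval_s` seconds. Here the probe is left as a callback so the classification logic can be exercised without a network.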

This issue will track implementation work on HDS.

Labels: api/v3, no stalebot
