Closed
Description
The current implementation of healthy_edge_interval and unhealthy_edge_interval will wait for the health threshold to be reached before the interval is used.
That leads to situations like this:
- `healthy_threshold` is set to 2;
- host health state is currently unhealthy;
- host health check fails, next check happens after `unhealthy_interval`;
- host health check succeeds, next check happens after `interval`;
- host health check succeeds, next check happens after `healthy_edge_interval`;
- host health check succeeds, next check happens after `interval`.
The behavior above defeats the purpose of having an edge interval since its goal is to detect health state changes faster whilst reducing the burden of health checks on healthy hosts.
The intended behavior to achieve the edge interval's purpose in the same scenario would be:
- host health check fails, next check happens after `unhealthy_interval`;
- host health check succeeds, next check happens after `healthy_edge_interval`;
- host health check succeeds, next check happens after `interval`;
- host health check succeeds, next check happens after `interval`.
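
To make the difference concrete, here is a small Python sketch that replays the scenario under both behaviors. It is only a model of the sequences listed in this issue, not the actual health checker code: the `schedule` function and its `edge_on_first_success` flag are hypothetical names, and any failure is treated as leaving the host unhealthy (that is, `unhealthy_threshold` is ignored).

```python
def schedule(results, healthy_threshold, edge_on_first_success):
    """Return the name of the interval chosen after each check result.

    results: iterable of booleans, True meaning the check succeeded.
    The host starts out unhealthy, as in the scenario described here.
    edge_on_first_success: False models the current behavior,
    True models the intended behavior.
    """
    healthy = False
    successes = 0  # consecutive successful checks
    chosen = []
    for ok in results:
        if not ok:
            # Simplification: every failure leaves the host unhealthy.
            successes = 0
            healthy = False
            chosen.append("unhealthy_interval")
            continue

        successes += 1
        became_healthy = not healthy and successes >= healthy_threshold
        if became_healthy:
            healthy = True

        if edge_on_first_success:
            # Intended: edge interval right after the first success.
            use_edge = successes == 1
        else:
            # Current: edge interval only once the threshold is reached.
            use_edge = became_healthy

        chosen.append("healthy_edge_interval" if use_edge else "interval")
    return chosen


scenario = [False, True, True, True]  # one failure, then three successes
current = schedule(scenario, healthy_threshold=2, edge_on_first_success=False)
intended = schedule(scenario, healthy_threshold=2, edge_on_first_success=True)
print("current: ", current)
print("intended:", intended)
```

With `healthy_threshold` set to 2, the current behavior only picks `healthy_edge_interval` after the second success, while the intended behavior picks it right after the first one.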
Repro steps:
- configure `unhealthy_threshold` to a value bigger than 1;
- configure an arbitrary value for `unhealthy_interval`;
- configure `unhealthy_edge_interval` to a value different than `unhealthy_interval`;
- cause a network timeout in the backend host and verify that the second failed health check will happen `unhealthy_interval` after the first failed one, instead of `unhealthy_edge_interval` after.
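
When checking the last step against logs, a helper along these lines can tell which interval was actually applied between the first and second failed checks. This is a sketch under assumptions: `classify_gap` is a hypothetical helper, the timestamps are assumed to come from wherever you observe the health checks (for example, backend access logs), and `tolerance_s` is an arbitrary allowance for timing noise.

```python
def classify_gap(first_failure_s, second_failure_s,
                 unhealthy_interval_s, unhealthy_edge_interval_s,
                 tolerance_s=0.5):
    """Name the configured interval that matches the observed gap.

    All arguments are in seconds. Returns "unhealthy_interval",
    "unhealthy_edge_interval", "ambiguous" (both match) or
    "unknown" (neither matches).
    """
    gap = second_failure_s - first_failure_s
    candidates = {
        "unhealthy_interval": unhealthy_interval_s,
        "unhealthy_edge_interval": unhealthy_edge_interval_s,
    }
    matches = [name for name, value in candidates.items()
               if abs(gap - value) <= tolerance_s]
    if len(matches) == 1:
        return matches[0]
    return "ambiguous" if matches else "unknown"


# Example values only: unhealthy_interval = 10s, unhealthy_edge_interval = 2s.
print(classify_gap(100.0, 110.1, 10, 2))
print(classify_gap(100.0, 102.2, 10, 2))
```

A result of `unhealthy_interval` for the first pair of failures reproduces the problem described here; `unhealthy_edge_interval` would match the intended behavior. Picking two clearly different interval values keeps the result from being `ambiguous`.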