liveness: allow testing main heartbeat loop deterministically #107452
Description
Liveness starts a goroutine that periodically heartbeats the node's own liveness record. This means that many tests that interact with liveness need to exert control over this goroutine. It would be easier to test NodeLiveness if the concurrency were externalized. In other words, rather than NodeLiveness.Start spawning a goroutine that loops and runs code that can only be reached by that goroutine, the loop body should be a method on NodeLiveness that can be invoked manually, and Start should take a suitably defined abstraction over a "periodic runner". In prod, the periodic runner would be the familiar async task with a for loop. In testing, it may just be a no-op, and the harness can call the method directly whenever it wants to pretend the ping interval elapsed.
It's a bit tricky to get this right, but there is probably an abstraction here that applies similarly to many other subsystems in CRL that start auxiliary goroutines (which are then difficult to test against). We should design these subsystems to give tests as much control over concurrency as possible.
Jira issue: CRDB-30050