mock-server: add single-step API #869
This depends on oxidecomputer/propolis#869
gjcolombo
left a comment
The mechanics of this generally look fine--just one question about a possible semantic difference between the mock and real servers.
        None
    }
    // Otherwise, if we have stepped to the requested generation, or
    // if we are not in single-step mode, just return the current
This is subtly different from the real server's behavior, I think--should the generation number "floor" still apply in automatic mode?
Ah, I've misread this--if the requested gen hasn't been published yet, then the get on line 312 will return None and we'll end up getting the expected behavior.
I think there is still a (different) subtle difference here: if you ask this API for generation J, and the state machine has advanced to generation K > J, then the mock will return the state as it was at generation J, but the "real" state machine will return whatever data is latest. But I think this kind of difference is OK to have if it makes the test double more useful when writing sled-agent tests. (If/when we refactor so that the mock state machine is based on propolis-server's state driver, we might want to put this behavior into a different API instead of changing the semantics of instance-state-monitor, but that's a problem for another day.)
Yeah, I think we can make this behavior more realistic, but it's a bit annoying. I'd like to do that in a follow up.
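To make the difference discussed in this thread concrete, here is a minimal sketch of the two lookup behaviors. It is not the actual mock-server or propolis-server code: `State`, `History`, `mock_lookup`, and `latest_lookup` are hypothetical names invented for illustration, and a `None` result stands in for "the request would keep waiting".

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an instance state; not the real Propolis type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Starting,
    Running,
    Stopping,
}

/// Every state published so far, keyed by generation number (starting at 1).
pub struct History {
    published: BTreeMap<u64, State>,
}

impl History {
    pub fn new(states: &[State]) -> Self {
        let published = states
            .iter()
            .copied()
            .enumerate()
            .map(|(i, s)| (i as u64 + 1, s))
            .collect();
        Self { published }
    }

    /// Mock-like behavior as described in the review: return the state as it
    /// was at exactly the requested generation, or `None` if that generation
    /// has not been published yet.
    pub fn mock_lookup(&self, requested: u64) -> Option<(u64, State)> {
        self.published.get(&requested).map(|s| (requested, *s))
    }

    /// Real-server-like behavior as described in the review: treat the
    /// requested generation as a floor and return whatever is latest.
    pub fn latest_lookup(&self, requested: u64) -> Option<(u64, State)> {
        let (latest, state) = self.published.iter().next_back()?;
        if *latest >= requested {
            Some((*latest, *state))
        } else {
            None
        }
    }
}
```

With three published generations, asking for generation 2 yields `(2, Running)` from `mock_lookup` but `(3, Stopping)` from `latest_lookup`; asking for generation 4 yields `None` from both.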
For certain test scenarios, the `propolis-mock-server` ought to have a mechanism
for manual control of the mocked instance's progress through the state machine.
In particular, this is necessary for testing changes like
oxidecomputer/omicron#7548, which adds a timeout tracked by the sled-agent when
an instance is stopped. If Propolis is stuck and cannot progress, the sled-agent
will forcefully terminate it after that timeout...but testing this requires a
way to make the mock Propolis pretend to be stuck.
This commit adds the following new endpoints to the mock server, which are not
part of the real `propolis-server` API:

- `PUT /mock/mode`: sets a mock mode, either `Run` (the normal behavior) or `SingleStep`, where state transitions only occur when asked for by the test.
- `GET /mock/mode`: returns whether or not we are single-stepping.
- `PUT /mock/step`: advances to the next queued generation.

Testing: I've written a test in Omicron that uses this, and I can make
the mock propolis get wedged in the correct place. So that's nice.
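As a rough illustration of the semantics behind these endpoints, the following in-memory model shows how a mode switch and a step operation could interact with a queue of pending states. It is only a sketch under assumptions: `MockInstance`, `MockMode`, and all method names are hypothetical, the HTTP layer is omitted, and draining the queue when switching back to `Run` is an assumed behavior rather than something the description above states.

```rust
use std::collections::VecDeque;

/// Hypothetical mirror of the two modes named in the PR description.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MockMode {
    Run,
    SingleStep,
}

/// Hypothetical in-memory model of a mocked instance; not the real server.
pub struct MockInstance {
    mode: MockMode,
    generation: u64,
    current: &'static str,
    queued: VecDeque<&'static str>,
}

impl MockInstance {
    pub fn new(initial: &'static str) -> Self {
        Self {
            mode: MockMode::Run,
            generation: 1,
            current: initial,
            queued: VecDeque::new(),
        }
    }

    /// Models `PUT /mock/mode`. Assumption: returning to `Run` publishes
    /// everything that was held back while single-stepping.
    pub fn set_mode(&mut self, mode: MockMode) {
        self.mode = mode;
        if mode == MockMode::Run {
            while self.step() {}
        }
    }

    /// Models `GET /mock/mode`.
    pub fn mode(&self) -> MockMode {
        self.mode
    }

    /// Queue a state transition. In `Run` mode it takes effect immediately;
    /// in `SingleStep` mode it waits for an explicit `step`.
    pub fn queue(&mut self, state: &'static str) {
        self.queued.push_back(state);
        if self.mode == MockMode::Run {
            while self.step() {}
        }
    }

    /// Models `PUT /mock/step`: advance to the next queued generation.
    /// Returns `false` when nothing is queued.
    pub fn step(&mut self) -> bool {
        match self.queued.pop_front() {
            Some(next) => {
                self.current = next;
                self.generation += 1;
                true
            }
            None => false,
        }
    }

    /// The currently published generation and state.
    pub fn current(&self) -> (u64, &'static str) {
        (self.generation, self.current)
    }
}
```

A test that wants the mock to look stuck would switch to `SingleStep`, queue the stop transitions, and simply never call `step`; the published state then stays where it was until the test decides otherwise.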
Closes #858