Skip to content

mock-server: add single-step API#869

Merged
hawkw merged 4 commits into
masterfrom
eliza/mock-single-step
Feb 21, 2025
Merged

mock-server: add single-step API#869
hawkw merged 4 commits into
masterfrom
eliza/mock-single-step

Conversation

@hawkw

@hawkw hawkw commented Feb 21, 2025

Copy link
Copy Markdown
Member

For certain test scenarios, the propolis-mock-server ought to have a mechanism
for manual control of the mocked instance's progress through the state machine.
In particular, this is necessary for testing changes like
oxidecomputer/omicron#7548, which adds a timeout tracked by the sled-agent when
an instance is stopped. If Propolis is stuck and cannot progress, the sled-agent
will forcefully terminate it after that timeout...but testing this requires a
way to make the mock Propolis pretend to be stuck.

This commit adds the following new endpoints to the mock server which are not
part of the real propolis-server API:

  • PUT /mock/mode: sets a mock mode, either Run (the normal behavior), or
    SingleStep, where state transitions only ocur when asked for by the test.
  • GET /mock/mode: returns whether or not we are single-steppy
  • PUT /mock/step: advances to the next queued generation

Testing: I've written a test in Omicron that uses this, and I can make
the mock propolis get wedged in the correct place. So that's nice.

Closes #858

@hawkw hawkw requested a review from gjcolombo February 21, 2025 00:08
hawkw added a commit to oxidecomputer/omicron that referenced this pull request Feb 21, 2025
hawkw added a commit to oxidecomputer/omicron that referenced this pull request Feb 21, 2025

@gjcolombo gjcolombo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mechanics of this generally look fine--just one question about a possible semantic difference between the mock and real servers.

None
}
// Otherwise, if we have stepped to the requested generation, or
// if we are not in single-step mode, just return the current

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is subtly different from the real server's behavior, I think--should the generation number "floor" still apply in automatic mode?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I've misread this--if the requested gen hasn't been published yet, then the get on line 312 will return None and we'll end up getting the expected behavior.

I think there is still a (different) subtle difference here: if you ask this API for generation J, and the state machine has advanced to generation K > J, then the mock will return the state as it was at generation J, but the "real" state machine will return whatever data is latest. But I think this kind of difference is OK to have if it makes the test double more useful when writing sled-agent tests. (If/when we refactor so that the mock state machine is based on propolis-server's state driver, we might want to put this behavior into a different API instead of changing the semantics of instance-state-monitor, but that's a problem for another day.)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we can make this behavior more realistic, but it's a bit annoying. I'd like to do that in a follow up.

None
}
// Otherwise, if we have stepped to the requested generation, or
// if we are not in single-step mode, just return the current

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I've misread this--if the requested gen hasn't been published yet, then the get on line 312 will return None and we'll end up getting the expected behavior.

I think there is still a (different) subtle difference here: if you ask this API for generation J, and the state machine has advanced to generation K > J, then the mock will return the state as it was at generation J, but the "real" state machine will return whatever data is latest. But I think this kind of difference is OK to have if it makes the test double more useful when writing sled-agent tests. (If/when we refactor so that the mock state machine is based on propolis-server's state driver, we might want to put this behavior into a different API instead of changing the semantics of instance-state-monitor, but that's a problem for another day.)

@hawkw hawkw merged commit 2652487 into master Feb 21, 2025
@hawkw hawkw deleted the eliza/mock-single-step branch February 21, 2025 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mock-server: want manual control of state transitions

2 participants