MGS: Implement "/sp" endpoint to fetch state for all SPs #746
A couple of basic tests are in place as of 90c49ca. I'd like to add more tests of some of the more complicated cases (e.g., an unresponsive SP), but that will require some more work in the SP simulator. May do that as part of this PR or as a followup, unsure at the moment.
Exercises the `/sp` endpoint to get state of all SPs.
Force-pushed from 0424cb3 to bf114de.
Force-pushed to account for #770.
// TODO we're dropping the error on the floor here - how should
// we handle it? This is an SP that we actively failed to
// communicate with somehow, which isn't the same as
// "unresponsive". Should we fail the entire request? That's how
In what situations might we hit this error?
- something is screwed up with the network configuration such that we get an error from the OS (e.g., an improper VLAN tag)
- a response from the SP that indicates an error, in which case that SP is in a weird state: able to respond with an error, but not to the simplest kind of message it might reasonably answer
Anything else?
The difficulty in answering this question is why I want to do some error cleanup! I can think of at least one other case that has to be handled, although in practice I don't expect to ever see it absent some horrible deployment mismatch nightmare: the SP sends a non-error response of a type that doesn't make sense (e.g., we ask it for its state and it responds with "here's a list of my components").
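As a rough illustration of the cases discussed above (timeout, OS-level error, SP-reported error, and a well-formed response of the wrong type), here is a minimal sketch of keeping "unresponsive" separate from "actively failed to communicate" instead of dropping the error. Every type and function name below (`SpResponse`, `Attempt`, `SpStatus`, `classify`, `classify_all`) is hypothetical and not taken from the MGS code; the real message and error types will differ.

```rust
/// Hypothetical replies an SP might send (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum SpResponse {
    State { serial: u32 },
    ComponentList(Vec<String>),
    Error(String),
}

/// Hypothetical outcome of one attempt to ask an SP for its state.
#[derive(Debug, Clone, PartialEq)]
enum Attempt {
    /// No reply arrived before the deadline.
    Timeout,
    /// The OS reported a failure (e.g., a network misconfiguration).
    Io(String),
    /// The SP replied with something.
    Reply(SpResponse),
}

/// Per-SP result reported to the caller; errors are kept, not dropped.
#[derive(Debug, Clone, PartialEq)]
enum SpStatus {
    Ok { serial: u32 },
    Unresponsive,
    CommunicationFailed { reason: String },
}

/// Map one attempt onto a per-SP status.
fn classify(attempt: Attempt) -> SpStatus {
    match attempt {
        Attempt::Timeout => SpStatus::Unresponsive,
        Attempt::Io(e) => SpStatus::CommunicationFailed {
            reason: format!("I/O error: {e}"),
        },
        Attempt::Reply(SpResponse::State { serial }) => SpStatus::Ok { serial },
        Attempt::Reply(SpResponse::Error(e)) => SpStatus::CommunicationFailed {
            reason: format!("SP reported an error: {e}"),
        },
        // A non-error response of a type that does not answer the question.
        Attempt::Reply(other) => SpStatus::CommunicationFailed {
            reason: format!("unexpected response type: {other:?}"),
        },
    }
}

/// Classify every SP independently so one failure does not fail the request.
fn classify_all(attempts: Vec<(u8, Attempt)>) -> Vec<(u8, SpStatus)> {
    attempts
        .into_iter()
        .map(|(target, attempt)| (target, classify(attempt)))
        .collect()
}
```

Whether a `CommunicationFailed` entry should instead fail the whole request is exactly the open question in the TODO; this sketch only shows the per-SP alternative.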
There are three things I want to change after working on this PR:
`/sp` endpoint added here given its complexity. I'll do this before merging, but I think it's fine to start reviewing, particularly if there are any nontrivial changes that would affect those tests.
`SpIdentifier`, sometimes it's a `SocketAddr`, sometimes it's an ignition target (which itself is just an index, and is sometimes `u8` and sometimes `usize`). I've been punting on this until we have a bit better understanding of how MGS is going to interact with the management network and track rack topology, but it's pretty unwieldy at this point. I'll do this separately too.
There's also an open question of whether the SP communications should be separated out from `gateway` entirely so that they can be shared with RSS. I'm strongly inclined to do this (and try to at least make progress on items 2 and 3 above in doing so) even if RSS ends up calling MGS instead of communicating directly, just from a crate cleanliness/organization point of view. Thoughts welcome!
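To make the identifier problem above concrete, here is a minimal sketch of one possible direction: a single newtype for the ignition target with checked conversion from `usize`, plus a small table that maps targets to socket addresses in both directions. `IgnitionTarget` and `SpTable` are hypothetical names for illustration, not types from the existing crate, and the sketch assumes the index fits in a `u8`.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;
use std::net::SocketAddr;

/// Hypothetical newtype so an ignition target is always one width.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct IgnitionTarget(u8);

impl TryFrom<usize> for IgnitionTarget {
    type Error = String;

    /// Checked conversion for call sites that currently hold a `usize`.
    fn try_from(index: usize) -> Result<Self, Self::Error> {
        u8::try_from(index)
            .map(IgnitionTarget)
            .map_err(|_| format!("ignition target {index} does not fit in u8"))
    }
}

impl From<IgnitionTarget> for usize {
    /// Lossless widening for indexing into slices.
    fn from(target: IgnitionTarget) -> usize {
        usize::from(target.0)
    }
}

/// Hypothetical lookup table between ignition targets and SP addresses.
#[derive(Debug, Default)]
struct SpTable {
    by_target: HashMap<IgnitionTarget, SocketAddr>,
}

impl SpTable {
    fn new() -> Self {
        Self::default()
    }

    fn insert(&mut self, target: IgnitionTarget, addr: SocketAddr) {
        self.by_target.insert(target, addr);
    }

    /// Address of the SP behind a given ignition target, if known.
    fn addr_of(&self, target: IgnitionTarget) -> Option<SocketAddr> {
        self.by_target.get(&target).copied()
    }

    /// Reverse lookup; a linear scan is acceptable for a small table.
    fn target_of(&self, addr: SocketAddr) -> Option<IgnitionTarget> {
        self.by_target
            .iter()
            .find(|(_, a)| **a == addr)
            .map(|(t, _)| *t)
    }
}
```

Keeping the conversions in one place would mean the `u8` versus `usize` choice is made once, at the boundary, rather than at each call site.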