Rework MGS internals for clarity#1583
Conversation
andrewjstone
left a comment
There was a problem hiding this comment.
This looks great @jgallagher. The organization changes make this a lot cleaner IMO, and having each SP handled in its own task likely provides better scalability/performance.
|
|
||
| #[derive(Debug)] | ||
| pub struct AttachedSerialConsole { | ||
| key: u64, |
There was a problem hiding this comment.
What's the purpose of key? Are multiple serial attachments to the same SP allowed?
There was a problem hiding this comment.
Not simultaneously; the purpose of key here is to distinguish "detach any currently-attached serial console client" (i.e., serial_console_detach() above, which sends a key of None) from "detach this currently-attached serial console client (i.e., AttachedSerialConsoleSend::send() below, which sends a key of Some(self.key)). This is (maybe only partially?) explained below:
// The associated value is the connection key; if
Some(_), only detach if
// the currently-attached key number matches. IfNone, detach any current
// connection. These correspond to "detach the current session" (performed
// automatically when a connection is closed) and "force-detach any session"
// (performed by a user).
In hindsight I should expand this: ultimately this is to avoid a race condition like this:
- Client A attaches to the serial console.
- Client B tries to attach and fails.
- Client B sends a "detach any existing clients", and then successfully attaches.
- Client A drops their connection, and sends a "detach my current connection".
With this ordering, we don't want step 4 to detach client B from step 3; the key makes this work because the key sent in step 4 won't match the key established in step 3, so the detach will be a noop (the fact that the key is different means client A's session has already been detached).
There was a problem hiding this comment.
Ok, this makes perfect sense. I somewhat gathered that after writing this question from another comment in the code where it talked about attaching after detach, but wasn't 100% sure so left my question.
4960419 to
0b3b584
Compare
The changes include: * Removal of `RecvHandler` and all its supporting machinery; instead of a monster tokio task reponsible for recv'ing on all sockets, pairing up request/response IDs, etc., we spawn a `SingleSp` task for each switch port that manages all communication on that port's socket. * Removal of the `timeout` option to most SP calls. `SingleSp` already has inherent (configured) timeouts for retries. At the HTTP client level, the client can already establish its own timeouts (e.g., in the event the network connection between client and server goes down); we don't need separate timeouts for SP communications (which the client has no reason to know how to set). * The serial console websocket handling has moved from `gateway-sp-comms` to `gateway`.
0b3b584 to
20ea79f
Compare
Crucible: Skip jobs when reinitializing to `Faulted` (#1583) Clear `repair_check_deadline` if repair is successfully started (#1581) Update rust-version reqs to reflect reality (#1580) Propolis: Update nvme-trace.d to match current probe definitions (#821) Fix clippy lints for Rust 1.83 PHD: use stty to widen the effective terminal for Linux guests (#818) Co-authored-by: Alan Hanson <alan@oxide.computer>
This is the followup PR to #1579, reworking MGS proper to use the
SingleSpintroduced in that PR forfaux-mgs. The changes include:RecvHandlerand all its supporting machinery; instead ofa monster tokio task reponsible for recv'ing on all sockets, pairing
up request/response IDs, etc., we spawn a
SingleSptask for eachswitch port that manages all communication on that port's socket.
timeoutoption to most SP calls.SingleSpalreadyhas inherent (configured) timeouts for retries. At the HTTP client
level, the client can already establish its own timeouts (e.g., in the
event the network connection between client and server goes down); we
don't need separate timeouts for SP communications (which the client
has no reason to know how to set).
gateway-sp-commstogateway.