nexus: release unused Propolis reservations after migration#3141
nexus: release unused Propolis reservations after migration#3141gjcolombo wants to merge 1 commit into
Conversation
When an instance's Propolis generation changes, examine how its Propolis ID and destination Propolis ID are changing and unregister and remove resource reservations from any Propolises that are no longer active. Update the migration integration test to check this behavior by checking to make sure that a simulated cluster that can fit exactly two instances can start one instance, migrate it, and then start the second instance.
cb9f4c8 to
04ba2a6
Compare
| /// Compares the existing Propolis record in CRDB (`db_instance`) with a new | ||
| /// instance runtime state supplied from a sled agent, determines whether | ||
| /// any Propolis VMMs will be left unused as a result of this transition, | ||
| /// and releases the resources that were reserved to them. |
| warn!(log, | ||
| "active Propolis did not change to migration target"; |
There was a problem hiding this comment.
Should this actually fail instead of just warn? An instance suddenly being served by a new separate propolis in the midst of a migration seems like a bug.
There was a problem hiding this comment.
I'll need to think about this one a little more. I agree that if we get to this path, we've definitely violated an invariant somewhere. I'm unsure what to do about it from this path, though. In theory, we can talk to any of the sleds involved from this path:
- this routine can return a response to this sled agent that tells it to do something specifically
- it can also use the
sled_idanddst_sled_idfrom the DB to contact either of the other involved sled agents
We could maybe use that to try to get all the instances to stop and then report the instance is Failed...? I'll need to mull it over (and I'm very much open to suggestions here :)).
(The main things I want to avoid are (a) silently going on like nothing's wrong, and (b) panicking--because this might mean that something unexpected is in CRDB, I want to avoid having the sled that's making this call go to each Nexus in turn and perform an operation that will deterministically cause it to panic.)
| } | ||
| None => { | ||
| warn!(log, | ||
| "active Propolis changed without an active migration"; |
There was a problem hiding this comment.
Not a blocker here but reminds me that maybe we should make active_propolis_id nullable on the instance record and allocate a new one whenever starting an instance. Anyways, that'd probably be part of making instance start a saga itself.
There was a problem hiding this comment.
| // The caller is going to replace the instance runtime state with | ||
| // the newly-advanced state it supplied, so there's no need to write | ||
| // the returned runtime state back to CRDB here. | ||
| let _ = old_sa.instance_unregister(&instance_id).await?; |
There was a problem hiding this comment.
Just to confirm, it's always the new sled-agent that would've triggered this nexus call right? i'd have to double check but if we got an update from the old sled-agent and tried to call back to it we might dead lock if it's holding the instance lock in sled-agent.
There was a problem hiding this comment.
Ooh, good catch.
Yes, right now it's always the new sled agent that triggers this call, though that's somewhat accidental--the update procedure is supposed to allow both sides to send the "transfer" state change, but the source doesn't do it because (today) the InstanceRuntimeState doesn't contain enough information about the destination for the source to be able to create the correct record. If it did have that information, though, we'd run right into the problem you're describing, because the monitor thread indeed holds the instance lock while publishing to Nexus.
In general, I'd prefer not to have the sled agent hold locks while calling out to other services. I think that's relatively doable in the monitor task's context. It might be more difficult in the propolis_ensure path, but that's less pressing for this PR because those calls to publish_state_to_nexus are explicitly only used when starting an instance without migrating it. I'll see if I can tighten up the monitor loop.
|
I'm going to approach this in a different way in an upcoming PR, so closing this one as obsolete. |
This is the last "big rock" PR in the live migration series. There will be some follow-up work to do to polish things up once this lands (#3107, #3139), but with this in place it should be possible to exercise live migration in the dogfood environment without progressively exhausting sleds' resources.
When an instance's Propolis generation changes, examine how its Propolis ID and destination Propolis ID are changing and unregister and remove resource reservations from any Propolises that are no longer active.
Update the migration integration test to check this behavior by checking to make sure that a simulated cluster that can fit exactly two instances can start one instance, migrate it, and then start the second instance.