nexus: update instance networking config after live migration#3127

Merged: gjcolombo merged 6 commits into main from gjcolombo/lets-migrate/9-update-v2p-mappings on May 17, 2023

Conversation

@gjcolombo

Contributor

Whenever Nexus gets a new instance runtime state from a sled agent, compare the state to the existing runtime state to see if applying the new state will update the instance's Propolis generation. If it will, use the sled ID in the new record to create updated OPTE V2P mappings and Dendrite NAT entries for the instance.
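A minimal sketch of the generation check described above, for readers skimming the PR. The struct, its field names, and the `sled_for_network_update` helper are hypothetical and chosen for illustration; they are not the actual Nexus types, and the real code also has to write the V2P mappings and NAT entries.

```rust
/// Hypothetical, simplified stand-in for an instance runtime state record.
/// The field names are assumptions, not the real Nexus schema.
#[derive(Clone, Debug, PartialEq, Eq)]
struct InstanceRuntimeState {
    /// Generation number that advances when the instance's Propolis changes.
    propolis_gen: u64,
    /// Identifier of the sled currently hosting the instance (simplified to u32).
    sled_id: u32,
}

/// Returns the sled ID to use for refreshed networking config if applying
/// `new` would advance the Propolis generation, or `None` if the generation
/// would not change (or `new` is stale).
fn sled_for_network_update(
    current: &InstanceRuntimeState,
    new: &InstanceRuntimeState,
) -> Option<u32> {
    if new.propolis_gen > current.propolis_gen {
        Some(new.sled_id)
    } else {
        None
    }
}
```

In this sketch a caller would only rebuild networking state when the helper returns `Some(sled_id)`, so state updates that leave the Propolis generation unchanged cause no extra work.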

Retry with backoff when sled agent fails to publish a state update to Nexus. This was required for correctness anyway (see #2727) but is especially important now that there are many more ways for Nexus to fail to apply a state update. See the comments in the new code for more details.
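The retry behavior can be sketched roughly as follows. This is a synchronous, standard-library-only illustration of retry with exponential backoff, not the sled agent's actual implementation: the `retry_with_backoff` name, its parameters, and the bounded attempt count are all assumptions made for the example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: calls `op` until it succeeds or `max_attempts` tries
/// have been made, doubling the delay between tries up to `max_delay`.
fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                // Exponential backoff, capped at `max_delay`.
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

Here `op` stands in for the call that publishes a state update to Nexus. A production version would likely be asynchronous and would need to decide which errors are worth retrying.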

In the future, it might be better to update this configuration using a reliable persistent workflow that's triggered by Propolis location changes. This approach will require at least some additional work in OPTE to assign generation numbers to V2P mappings (Dendrite might have a similar problem but I'm not as familiar with the tables Nexus is trying to maintain in this change).

@gjcolombo gjcolombo requested a review from jmpesp May 15, 2023 18:58

@jmpesp jmpesp left a comment

Contributor

Looks good to me, some questions:

Comment thread nexus/src/app/instance.rs Outdated
Comment thread nexus/src/app/instance.rs Outdated
Comment thread nexus/src/app/sagas/instance_create.rs Outdated
Comment thread nexus/tests/integration_tests/instances.rs
Comment thread nexus/tests/integration_tests/instances.rs
Comment thread nexus/tests/integration_tests/instances.rs
Comment thread nexus/src/app/instance.rs Outdated
gjcolombo added 6 commits May 17, 2023 15:55
Whenever Nexus gets a new instance runtime state from a sled agent, compare the
state to the existing runtime state to see if applying the new state will update
the instance's Propolis generation. If it will, use the sled ID in the new
record to create updated OPTE V2P mappings and Dendrite NAT entries for the
instance.

Retry with backoff when sled agent fails to publish a state update to Nexus.
This was required for correctness anyway (see #2727) but is especially
important now that there are many more ways for Nexus to fail to apply a state
update. See the comments in the new code for more details.
@gjcolombo gjcolombo force-pushed the gjcolombo/lets-migrate/9-update-v2p-mappings branch from 1c36e9c to 3feca50 Compare May 17, 2023 15:56
@gjcolombo gjcolombo merged commit f77cf31 into main May 17, 2023
@gjcolombo gjcolombo deleted the gjcolombo/lets-migrate/9-update-v2p-mappings branch May 17, 2023 17:29