Skip to content

[reconfigurator] Refactor MGS driven updates#9118

Merged
karencfv merged 11 commits into
oxidecomputer:mainfrom
karencfv:refactor-blocked-mgs
Oct 2, 2025
Merged

[reconfigurator] Refactor MGS driven updates#9118
karencfv merged 11 commits into
oxidecomputer:mainfrom
karencfv:refactor-blocked-mgs

Conversation

@karencfv

Copy link
Copy Markdown
Contributor

Just a few style changes as suggested in #9001

Closes: #9068

@karencfv karencfv requested a review from jgallagher September 30, 2025 02:38
Comment thread nexus/reconfigurator/planning/src/mgs_updates/mod.rs Outdated

@jgallagher jgallagher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this cleanup is very nice!

Comment on lines +505 to +506
#[error("failed Host OS update: {0:?}")]
HostOs(FailedHostOsUpdateReason),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these should be using error sources and not embedding debug strings; e.g.,

Suggested change
#[error("failed Host OS update: {0:?}")]
HostOs(FailedHostOsUpdateReason),
#[error("failed Host OS update")]
HostOs(#[from] FailedHostOsUpdateReason),

(We also have https://github.com/oxidecomputer/omicron/blob/main/docs/error-types-and-logging.adoc if you want much more detail on this.)

Should the error messages also be "failed to plan"? As written it sounds like we failed to perform an update.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes total sense thanks! I should really read that doc again 😄

Comment thread nexus/reconfigurator/planning/src/mgs_updates/host_phase_1.rs Outdated
Comment thread nexus/reconfigurator/planning/src/mgs_updates/mod.rs Outdated
.pending_persistent_boot_preference,
expected_transient_boot_preference: rot_state
.transient_boot_preference,
Ok(MgsUpdateOutcome::Pending(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style idea, not sure if I like it - what would you think about constructors on MgsUpdateOutcome to avoid (a) having to call Box::new() everywhere and (b) having to call PendingHostPhase2Changes::empty() for non-host-OS updates? Something like

// construct a non-host OS pending outcome
MgsUpdateOutcome::pending_without_host_phase_2(pending_mgs_update);

// construct a host OS pending outcome
MgsUpdateOutcome::pending_with_host_phase_2(pending_mgs_update, pending_host_phase2_changes);

@karencfv karencfv Oct 1, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like pending_with_host_phase_2 doesn't really differ from just directly setting MgsUpdateOutcome::Pending(update, pending_host_phase_2_changes). But, I like having the constructor for setting a pending update only. I've changed it this way. Let me know what you think!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me!

@karencfv karencfv left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look @jgallagher! I think I've addressed all of your comments. Please let me know if I missed anything :)

.pending_persistent_boot_preference,
expected_transient_boot_preference: rot_state
.transient_boot_preference,
Ok(MgsUpdateOutcome::Pending(

@karencfv karencfv Oct 1, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like pending_with_host_phase_2 doesn't really differ from just directly setting MgsUpdateOutcome::Pending(update, pending_host_phase_2_changes). But, I like having the constructor for setting a pending update only. I've changed it this way. Let me know what you think!

Comment on lines +505 to +506
#[error("failed Host OS update: {0:?}")]
HostOs(FailedHostOsUpdateReason),

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes total sense thanks! I should really read that doc again 😄

@jgallagher jgallagher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Just a couple small nits.

Comment on lines +518 to +522
pub trait FailedMgsUpdateReasonComponent {
/// Converts a failed planned update from a specific component into a
/// FailedMgsUpdateReason
fn into_generic(self) -> FailedMgsUpdateReason;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this trait - given the #[from]s inside each of the FailedMgsUpdateReason variants, we can say either err.into() or FailedMgsUpdateReason::from(err) to convert any of the component-specific errors.

pending_host_os_phase2_changes,
)) => {
pending_actions.add_pending_update(update);
// Host OS updates also return phase 2 changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit - this comment seems a little confusing/out of place, since update_attempt might not be a host OS update. I could see striking this altogether, or maybe making it clear that this set will be empty for non-host OS updates? Something like "If update_attempt is a host OS update, stage the phase 2 changes. For any other type, this set will be empty."?

@karencfv karencfv enabled auto-merge (squash) October 1, 2025 22:11
@karencfv karencfv merged commit f2a3273 into oxidecomputer:main Oct 2, 2025
17 checks passed
@karencfv karencfv deleted the refactor-blocked-mgs branch October 2, 2025 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[reconfigurator] Improve blocked MGS driven update code

2 participants