mgmt-gateway: update host flash#762
Conversation
30c0ca1 to
5129fd1
Compare
| most_recent_error: Option<HfError>, | ||
| } | ||
|
|
||
| impl Default for HostFlashSectorErase { |
There was a problem hiding this comment.
Do we need this Default implementation? It seems a little non-obvious (e.g. I could imagine that the default would be to erase all sectors, instead of none!)
There was a problem hiding this comment.
Hmm, we need something to the effect of it unless I make HostFlashUpdate::sector_erase an Option. That might be clearer; I'll give it a shot.
There was a problem hiding this comment.
Removed Default and ::start() in favor of ::new() with the parent storing an Option<_> of us in 5959e7a. I think that's more clear and obvious - thanks!
| stream_id: u64, | ||
| ) -> Result<(), ResponseError> { | ||
| match self.current_update_stream_id { | ||
| None => Err(ResponseError::UpdateNotPrepared), |
There was a problem hiding this comment.
Nit: is this the right error code?
Earlier in the PR, UpdateNotPrepared seems to mean "you sent a chunk while the update while still in the UpdatePrepare state", rather than "there is no current update happening at all".
There was a problem hiding this comment.
This method is a little annoying and will be reworked in the next PR; I think this is a reasonable error code given where this is called, but it's not obvious at all. I'm going to leave this as-is and revisit after the upcoming changes.
mkeeter
left a comment
There was a problem hiding this comment.
A few small comments, but generally looks good
This tracks the corresponding changes to MGS messaging merged in oxidecomputer/omicron#1684: 1. `UpdateStart` has been broken into `UpdatePrepare` and `UpdatePrepareStatus` (which MGS will continue to send periodically until we respond that preparation is done), allowing for updates that have a potentially-long running prep step (like updating host flash, which can take up to several minutes to erase!). 2. Update messages now include a stream-id that we use to correlate related messages; we reject update messages that don't match our current stream ID. 3. Add handling for the new `UpdateAbort` abort message to cancel an in-progress update. By far the most complex bit of this is 1: I've moved setting the system timer out of `mgs_gimlet` (which previously set it only in relation to flushing serial console uart packets out to MGS) and into `main`: now the MGS handler only returns the deadline it wants `main` to set. If we're in the process of prepping for a host flash update (i.e., we need to erase the host flash), we'll set our deadline to 1 tick from now. When it fires, we'll erase 8 sectors (takes about 1 second, worst case), then return and allow `main` to check for other work. This allows us to continue to be responsive to incoming notifications, importantly network requests (allowing us to respond to the `UpdatePrepareStatus` messages in a timely way!).
5129fd1 to
5959e7a
Compare
* mgmt-gateway: update host flash This tracks the corresponding changes to MGS messaging merged in oxidecomputer/omicron#1684: 1. `UpdateStart` has been broken into `UpdatePrepare` and `UpdatePrepareStatus` (which MGS will continue to send periodically until we respond that preparation is done), allowing for updates that have a potentially-long running prep step (like updating host flash, which can take up to several minutes to erase!). 2. Update messages now include a stream-id that we use to correlate related messages; we reject update messages that don't match our current stream ID. 3. Add handling for the new `UpdateAbort` abort message to cancel an in-progress update. By far the most complex bit of this is 1: I've moved setting the system timer out of `mgs_gimlet` (which previously set it only in relation to flushing serial console uart packets out to MGS) and into `main`: now the MGS handler only returns the deadline it wants `main` to set. If we're in the process of prepping for a host flash update (i.e., we need to erase the host flash), we'll set our deadline to 1 tick from now. When it fires, we'll erase 8 sectors (takes about 1 second, worst case), then return and allow `main` to check for other work. This allows us to continue to be responsive to incoming notifications, importantly network requests (allowing us to respond to the `UpdatePrepareStatus` messages in a timely way!).
* mgmt-gateway: update host flash This tracks the corresponding changes to MGS messaging merged in oxidecomputer/omicron#1684: 1. `UpdateStart` has been broken into `UpdatePrepare` and `UpdatePrepareStatus` (which MGS will continue to send periodically until we respond that preparation is done), allowing for updates that have a potentially-long running prep step (like updating host flash, which can take up to several minutes to erase!). 2. Update messages now include a stream-id that we use to correlate related messages; we reject update messages that don't match our current stream ID. 3. Add handling for the new `UpdateAbort` abort message to cancel an in-progress update. By far the most complex bit of this is 1: I've moved setting the system timer out of `mgs_gimlet` (which previously set it only in relation to flushing serial console uart packets out to MGS) and into `main`: now the MGS handler only returns the deadline it wants `main` to set. If we're in the process of prepping for a host flash update (i.e., we need to erase the host flash), we'll set our deadline to 1 tick from now. When it fires, we'll erase 8 sectors (takes about 1 second, worst case), then return and allow `main` to check for other work. This allows us to continue to be responsive to incoming notifications, importantly network requests (allowing us to respond to the `UpdatePrepareStatus` messages in a timely way!).
This tracks the corresponding changes to MGS messaging merged in oxidecomputer/omicron#1684:
UpdateStarthas been broken intoUpdatePrepareandUpdatePrepareStatus(which MGS will continue to send periodically until we respond that preparation is done), allowing for updates that have a potentially-long running prep step (like updating host flash, which can take up to several minutes to erase!).UpdateAbortabort message to cancel an in-progress update.By far the most complex bit of this is 1: I've moved setting the system timer out of
mgs_gimlet(which previously set it only in relation to flushing serial console uart packets out to MGS) and intomain: now the MGS handler only returns the deadline it wantsmainto set. If we're in the process of prepping for a host flash update (i.e., we need to erase the host flash), we'll set our deadline to 1 tick from now. When it fires, we'll erase 8 sectors (takes about 1 second, worst case), then return and allowmainto check for other work. This allows us to continue to be responsive to incoming notifications, importantly network requests (allowing us to respond to theUpdatePrepareStatusmessages in a timely way!).