prevent Nexus from performing schema updates when it's been MUPdated #8954

@davepacheco

After #8912, when Nexus starts up and determines that it's part of the set of Nexus instances currently in control, it will perform any needed database schema update. In the normal self-service update case, this is necessary and correct. However, if we had a normally running system after #8912 and someone were to MUPdate a sled that has a Nexus on it to a new version of Nexus with a new database schema, the new Nexus would find that it's still in the active set and do a schema update that would pull the rug out from under the other Nexus instances. That would violate our runtime constraint that Nexus instances only ever use the database while it's running the schema version that they know about.

The assumption is that this would probably be an operator / support mistake. It's always going to be dangerous to MUPdate a sled that's part of the control plane while the control plane is running, because of problems like this one, analogous problems with Crucible or Oximeter (which have their own on-disk formats and schema versions), and the risk of invalidating inter-API version dependencies, etc. But this Nexus case is particularly bad since in principle it could lead to permanent control plane database corruption. If it's easy to prevent, that seems worthwhile.

On today's watercooler call we discussed preventing this by having the db_metadata_nexus records contain the Nexus image id, having Nexus know its own image id, and then having Nexus stop on startup if it finds that its image doesn't match the one in its record. (We'd presumably also want it to expose this status somewhere so that an operator or support can see what's happened.)
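To make the proposal concrete, here is a minimal Rust sketch of the startup check, under stated assumptions. None of these names exist in the codebase as far as this issue says: `ImageId`, `DbMetadataNexusRecord`, its `image_id` field, `StartupDecision`, and `check_image_matches_record` are all hypothetical, and the real record would be read from the database rather than constructed in memory. The sketch only shows the comparison and a human-readable status that could be surfaced to an operator.

```rust
use std::fmt;

/// Hypothetical identifier for a Nexus image (for example, a content hash).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ImageId(pub String);

/// Hypothetical in-memory view of this Nexus's `db_metadata_nexus` record.
/// The `image_id` field is the proposed addition, not an existing column.
#[derive(Debug, Clone)]
pub struct DbMetadataNexusRecord {
    pub image_id: ImageId,
}

/// Outcome of the startup check.
#[derive(Debug, PartialEq, Eq)]
pub enum StartupDecision {
    /// The running image matches the record; normal startup may continue.
    Proceed,
    /// The running image differs from the record (e.g. after a MUPdate);
    /// Nexus should stop before attempting any schema update.
    Halt { recorded: ImageId, running: ImageId },
}

impl fmt::Display for StartupDecision {
    // Human-readable status that could be exposed to an operator or support.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StartupDecision::Proceed => write!(f, "image matches record"),
            StartupDecision::Halt { recorded, running } => write!(
                f,
                "image mismatch: record has {}, running {}; refusing to start",
                recorded.0, running.0
            ),
        }
    }
}

/// Compare the image id Nexus knows for itself against the one stored in
/// its record, and decide whether startup may continue.
pub fn check_image_matches_record(
    record: &DbMetadataNexusRecord,
    running: &ImageId,
) -> StartupDecision {
    if &record.image_id == running {
        StartupDecision::Proceed
    } else {
        StartupDecision::Halt {
            recorded: record.image_id.clone(),
            running: running.clone(),
        }
    }
}
```

In this sketch the check would run before the existing "am I in the active set?" logic triggers a schema update, so a MUPdated Nexus returns `Halt` and never touches the schema. How records written before the field existed should be treated is left open here.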
