This may be an issue related to the dropshot repo too, as Omicron isn't the sole user of this pattern.
This issue describes a process which is documented in our README, but which is still painful: https://github.com/oxidecomputer/omicron?tab=readme-ov-file#generated-service-clients-and-updating
Background
- We use dropshot to generate a JSON OpenAPI spec, which is then used to generate client crates.
- When using dropshot, we write servers (see: http_entrypoints.rs throughout the codebase) which use dropshot macros and ingest types to define the specification.
- These endpoints are capable of emitting an OpenAPI specification if and only if they compile. This often takes the form of an EXPECTORATE test, which spits out that aforementioned JSON file.
When is this a problem?
Mostly: When things don't compile, it can be exceptionally difficult to "regenerate the interface we want".
In the cleanest scenario, you'd make changes to your implementation AND interface together, then run the "openapi spec generation tests", then apply those changes to all clients which exist in the codebase. However, there are scenarios where this breaks:
- Circular dependencies. For example, Nexus exposes an internal API to sled agent, and sled agent exposes an API to Nexus. This isn't necessarily bad! They both have reasons to communicate with each other. However, updating interactions between the two servers often involves re-running the JSON generation several times, dealing with breakages, and then re-building. At an interface level, this is fine, but it's painful that "the rest of the crate's implementation (Nexus or sled agent) must also compile first, before the openapi test can run". This isn't too bad for adding new endpoints, but for altering existing interfaces, it often means filling up a codebase with todo!() and commented-out code to get to a compiling state, then undoing all those changes immediately afterwards.
- Merge conflicts. These force you to deal with an openapi spec that is out of sync with main, but also potentially out of sync with your current tree. Again, this forces the same type of solution: find everyone using types that are generated by the openapi spec, and "force them to do the wrong thing, as long as it compiles".
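To make the "todo!() and commented-out code" workaround concrete, here is a minimal sketch of what it tends to look like. The type and function names (SledInfo, register_sled) are hypothetical and not taken from the Omicron codebase; the only point is that unrelated implementation code gets stubbed out just so the crate compiles and the spec-generation test can run.

```rust
// All names here are hypothetical, for illustration only.

/// An interface type whose shape was just changed: imagine `generation`
/// is a newly added field that callers do not know about yet.
pub struct SledInfo {
    pub id: u32,
    pub generation: u64,
}

/// Implementation code elsewhere in the same crate. It has nothing to do
/// with generating the spec, but it must compile before the test can run.
pub fn register_sled(info: &SledInfo) -> String {
    // Original body, commented out because it no longer type-checks:
    // format!("registered sled {}", info.old_field)
    let _ = info; // keep the parameter "used" while the body is stubbed
    todo!("restore once the regenerated client is wired in")
}
```

The stub compiles, but calling it panics at runtime, which is why all of these changes have to be undone again right after the spec has been regenerated.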
A better world?
It would be really nice to decouple the implementation of these interfaces from the interface declarations themselves.
For example, if the Sled Agent API was not actually part of the sled-agent crate, but rather, a separate "sled-agent-interface" crate, we could do the following:
- EXPECTORATE=overwrite cargo nt -p sled-agent-interface could re-generate the client bindings, regardless of whether or not "the rest of sled agent compiles"
- The sled-agent crate could depend on the sled-agent-interface crate, and use it as a normal part of exposing an HTTP server
- Any other code attempting to implement the interface could share the server code. For example, the "simulated sled agent" -- which currently copies the implementation from sled-agent's http_entrypoints.rs -- could actually rely on the same interface.
The main advantage here would be "when you want to generate the new interface, do that first, before necessarily touching the implementation" and that would work. This should remove the need for "commented-out" code to get things to compile, since it would no longer be necessary to make the rest of the crate arbitrarily compile before generating the new interface.
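One possible shape for this split, sketched in plain Rust. This is not dropshot's API: the trait, the Endpoint struct, and the describe/spec_json/handle functions are all hypothetical stand-ins, and the three "crates" are collapsed into one file for brevity. The sketch only shows that the interface description can be produced without any implementation existing, and that a real and a simulated implementation can share the same dispatch code.

```rust
// --- hypothetical `sled-agent-interface` crate: declarations only ---

/// One endpoint in the interface (hypothetical, not a dropshot type).
#[derive(Debug, Clone, PartialEq)]
pub struct Endpoint {
    pub method: &'static str,
    pub path: &'static str,
    pub operation_id: &'static str,
}

/// The interface itself. Implementations live in other crates.
pub trait SledAgentApi {
    fn sled_role_get(&self) -> String;
    fn instance_count(&self) -> u32;
}

/// Describes the interface. Nothing here depends on an implementation,
/// so it keeps working even while implementations are mid-refactor.
pub fn describe() -> Vec<Endpoint> {
    vec![
        Endpoint { method: "GET", path: "/sled-role", operation_id: "sled_role_get" },
        Endpoint { method: "GET", path: "/instances/count", operation_id: "instance_count" },
    ]
}

/// Renders the description as a small JSON array (a stand-in for the
/// real OpenAPI document that a spec-generation test would write out).
pub fn spec_json() -> String {
    let entries: Vec<String> = describe()
        .iter()
        .map(|e| {
            format!(
                "{{\"method\":\"{}\",\"path\":\"{}\",\"operationId\":\"{}\"}}",
                e.method, e.path, e.operation_id
            )
        })
        .collect();
    format!("[{}]", entries.join(","))
}

/// Server-side dispatch shared by every implementation of the trait.
pub fn handle(api: &dyn SledAgentApi, operation_id: &str) -> Option<String> {
    match operation_id {
        "sled_role_get" => Some(api.sled_role_get()),
        "instance_count" => Some(api.instance_count().to_string()),
        _ => None,
    }
}

// --- hypothetical `sled-agent` crate: the real implementation ---

pub struct RealSledAgent {
    pub instances: Vec<String>,
}

impl SledAgentApi for RealSledAgent {
    fn sled_role_get(&self) -> String {
        "gimlet".to_string()
    }
    fn instance_count(&self) -> u32 {
        self.instances.len() as u32
    }
}

// --- hypothetical simulated sled agent: reuses the same interface ---

pub struct SimSledAgent;

impl SledAgentApi for SimSledAgent {
    fn sled_role_get(&self) -> String {
        "simulated".to_string()
    }
    fn instance_count(&self) -> u32 {
        0
    }
}
```

With this layout, spec_json() only needs the interface section to compile, while handle(&SimSledAgent, "sled_role_get") and handle(&RealSledAgent { instances: vec![] }, "sled_role_get") go through the same dispatch code instead of a copied http_entrypoints.rs.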