[internal-dns] register and publish ddmd in the switch zone#10381
Conversation
DDMD has always run in the switch zone alongside Dendrite, MGS, and MGD, but it was never registered in internal DNS, leaving no path for a cross-host consumer to discover it. This adds `ServiceName::Ddm`, plumbs `ddm_port` through the host-zone switch (RSS plan + reconfigurator DNS execution), threads an `Overridables::ddm_ports` map for the test suite, and lands a `DdmInstance` dropshot sim in test utils so that the test harness registers a real DDM port in DNS the same way it does for the other switch-zone services. We also drop the duplicate `DDMD_PORT` const in `ddm-admin-client` in favor of the canonical `omicron_common::address::DDMD_PORT`. Same-host callers continue to use `Client::localhost()`. This was extracted from the multicast PR (zl/multicast-mgd-ddm), where Nexus is the first consumer to reach ddmd cross-host by resolving it through DNS.
jgallagher
left a comment
Thank you very much for splitting this out!
Omicron's oxidecomputer/omicron#10381 introduces a stubbed `ddmd` admin endpoint because spawning a real `ddmd` in a generic test toolchain is not viable: the routing state machine (discovery, exchange, route synchronization) depends on illumos networking facilities the toolchain does not provide. Consumers of the stub, e.g., Nexus RPW (multicast members), sled-agent's DDM reconciler, and anything that resolves the DDM internal-DNS service name, cannot exercise the real admin surface from Omicron's test harness.

This work adds an opt-in `--no-state-machine` flag to `ddmd` that runs only the admin API server and skips the state machine entirely, allowing the fixture to spawn the real binary. This is analogous to `mgd --no-bgp-dispatcher`, which Omicron's `MgdInstance` already uses for the same purpose.

To make the fixture path usable on Linux, `ddmd` itself must build on Linux. The previous code pulled the illumos-only crates `libnet`, `dpd-client`, `opte-ioctl`, and `oxide-vpc` unconditionally through `ddm`, which failed to link on Linux (`-lzfs`, `-ldlpi`). This change introduces an `illumos` feature in both `ddm` and `ddmd` (default-on, mirroring `mgd`'s `mg-lower` pattern) that marks those four crates optional. The buildomat `linux.sh` job now builds `ddmd` and `ddmadm`, with `ddmd` invoked as `cargo build --bin ddmd --no-default-features`.

The illumos-only halves of `ddm` are isolated by the feature gate:

- The routing state machine implementation moves from `sm.rs` into `sm/state.rs`.
- The exchange runtime (HTTP push/pull and route programming) moves from `exchange.rs` into `exchange/runtime.rs`.
- The discovery runtime (UDPv6 solicitation/advertisement loops) moves from `discovery.rs` into `discovery/runtime.rs`.

Each parent `mod.rs` keeps the platform-agnostic types and re-exports the runtime surface so existing call sites resolve unchanged on illumos. The runtime submodules are gated as a unit by `#[cfg(all(feature = "illumos", target_os = "illumos"))]`.

We also remove the single-function `ddm/src/util.rs`, inlining the function into `discovery/runtime.rs`, where its sole caller lives. The SIGTERM cleanup handler is installed regardless of the flag, so Ctrl-C still exits cleanly in `--no-state-machine` mode. The imported route sets are empty in that mode, so the cleanup itself is a noop. Passing `--addr` alongside `--no-state-machine` is harmless but ignored, with a warning logged.
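The gating described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the maghemite source: `PeerInfo`, `start`, and `runtime_compiled_in` are hypothetical names, and only the `cfg` expression itself is taken from the text above.

```rust
// Parent module: platform-agnostic types build on every target.
pub mod discovery {
    /// Hypothetical stand-in for a type that stays in the parent `mod.rs`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct PeerInfo {
        pub hostname: String,
    }

    // The runtime half is compiled only when the feature is enabled
    // AND the target is illumos, so Linux builds never see it.
    #[cfg(all(feature = "illumos", target_os = "illumos"))]
    pub mod runtime {
        /// Hypothetical entry point for the illumos-only runtime.
        pub fn start() -> &'static str {
            "discovery runtime started"
        }
    }

    // Re-export so existing call sites keep resolving on illumos.
    #[cfg(all(feature = "illumos", target_os = "illumos"))]
    pub use runtime::start;
}

/// Reports whether the gated runtime was compiled into this build.
pub fn runtime_compiled_in() -> bool {
    cfg!(all(feature = "illumos", target_os = "illumos"))
}
```

Because both the submodule and its re-export carry the same `cfg`, a build with `--no-default-features` (or on a non-illumos target) simply omits the runtime surface rather than failing at link time.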
…fixture

We address @jgallagher's review by:

- Replacing the four positional `u16` arguments in `DnsConfigBuilder::host_zone_switch` with a `HostSwitchZonePorts` named-fields structure.
- Replacing the dropshot-based stubbed `DdmInstance` in test-utils with a fixture that spawns and supervises a real `ddmd` subprocess running with `--no-state-machine`, analogous to `MgdInstance` and `mgd --no-bgp-dispatcher`.

Only the switch-zone `ddmd` is registered in internal DNS, while sled-global-zone instances are accessed locally by their own host and don't need DNS registration. This **does** require maghemite changes, already PR'ed to oxidecomputer/maghemite#729.

To make this all work, we wire `ddmd` into the developer xtask toolchain. `cargo xtask download maghemite-ddmd` reuses the existing `mg-ddm.tar.gz` illumos zone artifact (extracting `ddmd`/`ddmadm`). On Linux it overlays a raw `ddmd` binary, and on macOS it builds from source. Also, we had to bump `oxnet` from 0.1.4 to 0.1.5 to satisfy the new maghemite pin.
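To illustrate why the named-fields structure helps, here is a rough sketch. The field names and the free function standing in for `DnsConfigBuilder::host_zone_switch` are assumptions for illustration; only the type name `HostSwitchZonePorts` comes from the commit message.

```rust
/// Named ports for the switch-zone services.
/// Field names are hypothetical; the real struct may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostSwitchZonePorts {
    pub dendrite: u16,
    pub mgs: u16,
    pub mgd: u16,
    pub ddm: u16,
}

/// Stand-in for the builder method: returns the (service, port) pairs
/// that would be registered for one switch zone.
pub fn host_zone_switch(ports: HostSwitchZonePorts) -> Vec<(&'static str, u16)> {
    vec![
        ("dendrite", ports.dendrite),
        ("mgs", ports.mgs),
        ("mgd", ports.mgd),
        ("ddm", ports.ddm),
    ]
}
```

With four positional `u16` arguments, swapping two ports compiles silently. With named fields, each value is labeled at the call site, so a mix-up is visible in review.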
e3c6a18 to
9824436
Compare
f69d6d4 to
e212660
Compare
Picks up recent oxidecomputer/maghemite#729 (ddmd `--api-only` flag) and the preceding main changes that moved canonical types out of the auto-generated client into the `mg-api-types` crate. Includes:

- replaces `rdb-types` (removed upstream) with `mg-api-types` as a direct workspace dep
- bumps `num_enum` 0.7.5 -> 0.7.6 to satisfy maghemite's workspace pin
- migrates types
- renames `bgp_apply_v2` callers to `bgp_apply`
- switches the flag passed by the `DdmInstance` fixture from `--no-state-machine` to `--api-only` to match the new clap flag.
jgallagher
left a comment
Changes all LGTM - just one comment about PR ordering.
Omicron's oxidecomputer/omicron#10381 introduces a stubbed `ddmd` admin endpoint because spawning a real `ddmd` in a generic test toolchain is not viable: the routing state machine (discovery, exchange, route synchronization) depends on illumos networking facilities the toolchain does not provide. Consumers of the stub, e.g., Nexus RPW (multicast members), sled-agent's DDM reconciler, and anything that resolves the DDM internal-DNS service name, cannot exercise the real admin surface from Omicron's test harness.

This work adds an opt-in `--api-only` flag to `ddmd` that runs only the admin API server and skips the state machine entirely, allowing the fixture to spawn the real binary. This is analogous to `mgd --no-bgp-dispatcher`, which Omicron's `MgdInstance` already uses for the same purpose.

To make the fixture path usable on Linux, `ddmd` itself must build on Linux. The previous code pulled the illumos-only crates `libnet`, `dpd-client`, `opte-ioctl`, and `oxide-vpc` unconditionally through `ddm`, which failed to link on Linux (`-lzfs`, `-ldlpi`). This change introduces a `backend` feature in both `ddm` and `ddmd` (default-on, mirroring `mgd`'s `mg-lower` pattern) that marks those four crates optional. The buildomat `linux.sh` job now builds `ddmd` and `ddmadm`, with `ddmd` invoked as `cargo build --bin ddmd --no-default-features`.

The illumos-only halves of `ddm` are isolated by the feature gate:

- The routing state machine implementation moves from `sm.rs` into `sm/state.rs`.
- The exchange runtime (HTTP push/pull and route programming) moves from `exchange.rs` into `exchange/runtime.rs`.
- The discovery runtime (UDPv6 solicitation/advertisement loops) moves from `discovery.rs` into `discovery/runtime.rs`.

Each parent `mod.rs` keeps the platform-agnostic types and re-exports the runtime surface so existing call sites resolve unchanged on illumos. The runtime submodules are gated as a unit by `#[cfg(all(feature = "backend", target_os = "illumos"))]`.

We also remove the single-function `ddm/src/util.rs`, inlining the function into `discovery/runtime.rs`, where its sole caller lives. The SIGTERM cleanup handler is installed regardless of the flag, so Ctrl-C still exits cleanly in `--api-only` mode. The imported route sets are empty in that mode, so the cleanup itself is a noop. `--api-only` and `--addr` are mutually exclusive at the clap level (`conflicts_with`), so passing them together is rejected at parse time.
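The parse-time rejection can be pictured with a small std-only sketch. This is a hand-rolled stand-in for the rule that clap's `conflicts_with` enforces, not the actual `ddmd` argument parser: the `Mode` enum, `parse_mode`, and the assumption that `--addr` takes one value per occurrence are all illustrative.

```rust
/// Hypothetical run modes derived from the command line.
#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    /// Admin API server only, no state machine.
    ApiOnly,
    /// Normal operation on the given addresses.
    Routing { addrs: Vec<String> },
}

/// Parse a simplified argument list, rejecting `--api-only` together
/// with `--addr` before any server or state machine is started.
pub fn parse_mode(args: &[&str]) -> Result<Mode, String> {
    let mut api_only = false;
    let mut addrs = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        match *arg {
            "--api-only" => api_only = true,
            "--addr" => {
                let value = iter
                    .next()
                    .ok_or_else(|| "--addr requires a value".to_string())?;
                addrs.push(value.to_string());
            }
            other => return Err(format!("unexpected argument: {other}")),
        }
    }
    // The mutual-exclusion rule: fail at parse time, not at runtime.
    if api_only && !addrs.is_empty() {
        return Err("--api-only cannot be used with --addr".to_string());
    }
    Ok(if api_only { Mode::ApiOnly } else { Mode::Routing { addrs } })
}
```

Rejecting the combination up front replaces the earlier behavior, where `--addr` was accepted and ignored with a warning, with an error the caller cannot miss.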
3d213d7 to
9958291
Compare
This brings main forward and updates maghemite to current main (9bb5037167c1ff0d812299f668841c9b7bda4480, including the merged PR oxidecomputer/maghemite#729 with the ddmd `--api-only` flag). We also bump workspace clap from 4.5 to 4.6 to satisfy the new maghemite constraint. The lockfile cascades through to align omicron-as-git refs at 915f229 too. Finally, we add `oxlog` to the `[patch."github.com/oxidecomputer/omicron"]` list to resolve a duplicate-package error from maghemite's transitive illumos-utils -> oxlog pull.
9958291 to
d250ae7
Compare
@taspelund this also gets us aligned with all your work in maghemite so far.
Awesome, thanks zeeshan! The diff looks okay to me, but I haven't come up to speed on omicron yet so I'd definitely rely on John's review from that standpoint
jgallagher
left a comment
LGTM - one small nit, and a question about cross-repo dependencies.
…one more mags update
Was holding on incorporating the latest main from @taspelund, but we were waiting on the lab-3.0-gimlet image there for falcon's CI job. @taspelund, thoughts?
Final, pre-review pass on this work. It stacks atop #10070 and inherits the multicast-to-physical (M2P) underlay forwarding and VMM-keyed instance subscription endpoints. This also builds on and integrates #10381. Above these foundations, this work includes the final pass on mgd-ddmd integration:

* Reconciler correctness:
  * `set_mcast_m2p` rolls back the xde M2P entry on per-NIC join failure, so the reconciler converges on a retry instead of leaving stale state pointing at the wrong underlay address.
  * `propolis_id` is threaded end-to-end through the sled-agent multicast endpoints to deal with live migration ambiguity.
  * MRIB advertisement is gated on a flag rather than running unconditionally after the DPD match arm, so that a DPD failure no longer leaves a route advertised via DDM with no programmed forwarding state.
* OPTE hardening (illumos-utils):
  * M2P entries upserted into a `BTreeMap<IpAddr, MulticastUnderlay>` rather than a Vec on the non-illumos mock, eliminating duplicate-key corner cases the production map already avoided.
  * `MulticastFilterMap` encapsulates the per-NIC filter socket and refcount state previously open-coded inside `PortManagerInner`, concentrating the "join socket per underlay group per NIC" invariant into one singular type.
  * `underlay_nics` typed as `&[AddrObject]` rather than `&[String]`.
  * Per-NIC `IPV6_JOIN_GROUP` calls converted from `libc::setsockopt` to `nix::sys::socket::setsockopt` for the typed bind.
* Sled-agent (real and sim):
  * Sim v7 multicast endpoints fall through to the trait defaults instead of overriding with just `unimplemented!()`, matching how other versioned endpoints behave in the sim.
  * Sim VMM existence check on join/leave restored.
* Configuration:
  * `MulticastGroupReconcilerConfig` gains a `group_concurrency_limit` and `member_concurrency_limit` bounding the per-pass fan-out of the RPW's `buffer_unordered` streams.
* Test infra:
  * `populate_ddm_peers` no longer caches the peer map. The previous cache was keyed by sled-id set, but the synthesized port names embedded each sled's `sp_slot` from inventory, so cache reuse within the same sled set could produce stale port mappings.
* Documentation cleanup across the RPW, sled-agent multicast paths, and the new(er) sled-agent types module.
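The rollback behavior listed under reconciler correctness can be sketched as below. This is a simplified model under stated assumptions, not the illumos-utils code: `McastState`, the closure-based `join` callback, and the choice to restore the previous mapping (rather than always delete) are all hypothetical. Only the name `set_mcast_m2p` and the use of a `BTreeMap` keyed by `IpAddr` come from the description above.

```rust
use std::collections::BTreeMap;
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical holder for multicast-to-physical (M2P) mappings.
pub struct McastState {
    m2p: BTreeMap<IpAddr, Ipv6Addr>,
}

impl McastState {
    pub fn new() -> Self {
        Self { m2p: BTreeMap::new() }
    }

    /// Upsert the M2P entry, then join the underlay group on every NIC.
    /// If any join fails, undo the upsert so a retry starts from a
    /// consistent state.
    pub fn set_mcast_m2p<F>(
        &mut self,
        group: IpAddr,
        underlay: Ipv6Addr,
        nics: &[&str],
        mut join: F,
    ) -> Result<(), String>
    where
        F: FnMut(&str, Ipv6Addr) -> Result<(), String>,
    {
        // `insert` on a map is an upsert: no duplicate keys are possible.
        let previous = self.m2p.insert(group, underlay);
        for &nic in nics {
            if let Err(e) = join(nic, underlay) {
                // Roll back to whatever was there before this call.
                match previous {
                    Some(old) => {
                        self.m2p.insert(group, old);
                    }
                    None => {
                        self.m2p.remove(&group);
                    }
                }
                return Err(format!("join on {nic} failed: {e}"));
            }
        }
        Ok(())
    }

    pub fn lookup(&self, group: &IpAddr) -> Option<Ipv6Addr> {
        self.m2p.get(group).copied()
    }
}
```

The point of the pattern is that a failed pass leaves no entry pointing at an underlay address that no NIC has joined, so the next reconciler pass can simply try again.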
This now includes the necessary maghemite fixes around deps.
6da3b62 to
c8571fb
Compare
@jgallagher, @taspelund maybe worth one more review (during closed gate) after the latest maghemite was brought in.
DDMD has always run in the switch zone alongside Dendrite, MGS, and MGD, but it was never registered in internal DNS, leaving no path for a cross-host consumer to discover it. This adds `ServiceName::Ddm`, plumbs `ddm_port` through the host-zone switch (RSS plan + reconfigurator DNS execution), threads an `Overridables::ddm_ports` map for the test suite, and includes a `DdmInstance` test fixture in test-utils that spawns a real `ddmd` subprocess via `--api-only` (matching `MgdInstance`'s pattern) so that the test harness registers a real DDM port in DNS the same way it does for the other switch-zone services. We also drop the duplicate DDMD_PORT const in `ddm-admin-client` in favor of the canonical `omicron_common::address::DDMD_PORT`. Same-host callers continue to use `Client::localhost()`.
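A fixture of this kind, one that spawns the real binary and supervises it, might look roughly like the sketch below. It is not the omicron `DdmInstance`: the `DdmFixture` type, the `ddmd_command` helper, and the pass-through `extra_args` are hypothetical, and the only flag taken from the description is `--api-only`.

```rust
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Build the command line for an API-only `ddmd`.
/// `extra_args` is a hypothetical pass-through for whatever else the
/// harness needs (for example, port selection).
pub fn ddmd_command(ddmd_bin: &Path, extra_args: &[&str]) -> Command {
    let mut cmd = Command::new(ddmd_bin);
    cmd.arg("--api-only").args(extra_args).stdin(Stdio::null());
    cmd
}

/// Owns the child process so it is cleaned up when the test ends.
pub struct DdmFixture {
    child: Child,
}

impl DdmFixture {
    /// Spawn the binary; fails if it cannot be found or executed.
    pub fn spawn(ddmd_bin: &Path, extra_args: &[&str]) -> std::io::Result<Self> {
        let child = ddmd_command(ddmd_bin, extra_args).spawn()?;
        Ok(Self { child })
    }
}

impl Drop for DdmFixture {
    fn drop(&mut self) {
        // Best-effort teardown: kill the process, then reap it so no
        // zombie outlives the test.
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

Tying teardown to `Drop` means a panicking test still stops the subprocess, which is the main supervision property a test fixture needs.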
In PR #10381, this binary was added as a dependency of the omicron tests, without which they will always fail. For NixOS users, this means it must also be added to the Nix flake in order for the tests to be runnable on NixOS. Commit 9a8539b does that. Most of the Nix code for fetching ddmd and making a `ddmd` derivation was nearly identical to the code for doing the same for `mgd`. Thus, I also factored it out into a reusable helper in a subsequent commit, c615beb.
DDMD has always run in the switch zone alongside Dendrite, MGS, and MGD, but it was never registered in internal DNS, leaving no path for a cross-host consumer to discover it. This adds `ServiceName::Ddm`, plumbs `ddm_port` through the host-zone switch (RSS plan + reconfigurator DNS execution), threads an `Overridables::ddm_ports` map for the test suite, and includes a `DdmInstance` test fixture in test-utils that spawns a real `ddmd` subproc via `--no-state-machine` (matching `MgdInstance`'s pattern) so that the test harness registers a real DDM port in DNS the same way it does for the other switch-zone services.
We also drop the duplicate `DDMD_PORT` const in `ddm-admin-client` in favor of the canonical `omicron_common::address::DDMD_PORT`. Same-host callers continue to use `Client::localhost()`.
The legit subproc fixture depends on oxidecomputer/maghemite#729, which adds a `no-state-machine` flag to `ddmd` that skips the kernel-related state machine and leaves only the admin API running.
This was extracted from the multicast PR (zl/multicast-mgd-ddm), where Nexus is the first consumer to reach ddmd cross-host by resolving it through DNS.
References