validation: enforce data plane alignment with storage mappings and add flowctl discover #2404
jgraettinger merged 6 commits into master
These were historically included because they were part of models::Catalog, and were thus covered by `validation` as part of integrated snapshot testing with the `sources` crate. That's no longer true: storage mappings are injected into the live catalog and are not part of the draft. So, remove them from snapshots as the original rationale for including them no longer holds.
…gnment

Storage mappings already included a data_planes field that wasn't being used. This change activates that functionality: tasks are now assigned to data planes based on their storage mapping's data_planes list, with the first entry as default. Users can still explicitly override with a specific data plane if needed.

Key changes:
- Remove `is_default` field from data planes table
- Enforce that tasks use data planes from their storage mapping's list
- Use first data plane in storage mapping as default for task initialization
- Add explicit_plane parameter to allow user overrides (validated against mapping)
- Validate alignment between partition and recovery storage mappings
- Add better error messages when data planes are missing or mismatched
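The selection rule described above (first entry as default, explicit override validated against the mapping) can be sketched as follows. This is a minimal illustration, not the code from this PR: the `StorageMapping` struct, the `select_data_plane` function, and the error strings are hypothetical stand-ins.

```rust
/// Hypothetical stand-in for a storage mapping; only the fields needed here.
#[derive(Debug)]
pub struct StorageMapping {
    pub catalog_prefix: String,
    /// Admissible data planes; the first entry acts as the default.
    pub data_planes: Vec<String>,
}

/// Pick the data plane for a new task (illustrative only).
/// With an explicit plane, it must appear in the mapping's list;
/// otherwise the first listed plane is used.
pub fn select_data_plane<'m>(
    mapping: &'m StorageMapping,
    explicit_plane: Option<&str>,
) -> Result<&'m str, String> {
    match explicit_plane {
        Some(name) => mapping
            .data_planes
            .iter()
            .find(|p| p.as_str() == name)
            .map(String::as_str)
            .ok_or_else(|| {
                format!(
                    "data plane {} is not listed by storage mapping {}",
                    name, mapping.catalog_prefix
                )
            }),
        None => mapping
            .data_planes
            .first()
            .map(String::as_str)
            .ok_or_else(|| {
                format!(
                    "storage mapping {} lists no data planes",
                    mapping.catalog_prefix
                )
            }),
    }
}
```

Calling `select_data_plane(&mapping, None)` returns the first listed plane, while passing `Some(name)` succeeds only when `name` is in the mapping's list.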
Previously, flowctl's local_specs::Resolver used NoOpCatalogResolver, which provided a placeholder storage mapping and data plane. This change updates it to query actual storage mappings and data planes from the control plane via PostgREST APIs. This brings local validation into closer alignment with production validation and makes it possible for `flowctl` to determine the data-plane to which a new specification should be submitted. `--default-data-plane` is renamed to `--init-data-plane` and is now optional, and None by default. When None, new specifications are placed in the first data-plane of their covering storage mapping.
Implement `flowctl discover` command that submits discovery jobs to the control plane rather than running connectors locally.

The command:
- Loads and validates source specifications
- Creates a draft with encrypted endpoint configurations
- Submits discovery job to the discovers table with appropriate data-plane
- Polls the job while streaming logs until completion
- Downloads the updated draft to local files

This provides a similar UX to `flowctl raw discover` but leverages the control plane's discovery infrastructure rather than local connector execution.
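The "poll the job while streaming logs until completion" step can be sketched as a plain loop. This is only an illustration under assumptions: `JobStatus`, `Poll`, and `wait_for_job` are hypothetical names, the real job states may differ, and the closure `fetch` stands in for whatever control-plane query `flowctl` actually performs.

```rust
use std::{thread, time::Duration};

/// Hypothetical job states; the real discovers table may use different ones.
#[derive(Debug, Clone, PartialEq)]
pub enum JobStatus {
    Queued,
    Success,
    Failed(String),
}

/// One poll result: current status plus log lines emitted since the last poll.
pub struct Poll {
    pub status: JobStatus,
    pub new_logs: Vec<String>,
}

/// Call `fetch` until the job leaves `Queued`, forwarding each new log
/// line to `on_log` and sleeping `interval` between polls.
pub fn wait_for_job<F, L>(mut fetch: F, mut on_log: L, interval: Duration) -> Result<(), String>
where
    F: FnMut() -> Poll,
    L: FnMut(&str),
{
    loop {
        let poll = fetch();
        for line in &poll.new_logs {
            on_log(line.as_str());
        }
        match poll.status {
            JobStatus::Queued => thread::sleep(interval),
            JobStatus::Success => return Ok(()),
            JobStatus::Failed(reason) => return Err(reason),
        }
    }
}
```

Passing the fetch and log handlers as closures keeps the loop independent of any particular client, so it can be exercised with canned poll results.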
@psFried what's the workflow to update versioned sqlx queries? And no changes to .sqlx/ |
@jgraettinger I think your |
    D::ModelDef,            // Model to validate.
    models::Id,             // Live control-plane ID.
    models::Id,             // Assigned data-plane.
    &'a tables::DataPlane,  // Assigned data-plane.
This return type is getting pretty unwieldy, and I'm thinking that at some point it'll probably be better to declare a TransitionOk<'a, D, L, B> struct. Doesn't need to be now, though.
    auto_approve: bool,
    /// Data-plane into which created specifications will be placed.
    #[clap(long, default_value = "ops/dp/public/gcp-us-central1-c2")]
    default_data_plane: String,
There are docs in site/docs/guides/flowctl/ci-cd.md that refer to this argument and will need to be updated.
May or may not be worth it, but we could also leave this argument here, but hidden, and print an error message if it gets passed. Just thinking about users upgrading, and wanting to make sure we communicate the breaking change effectively. Would also be good if we remember to include this in the release notes.
Updated docs, and I added back --default-data-plane as an alias for --init-data-plane.
Small update to flowctl documentation as well.
## What's Changed

This release introduces support for the [new `redact` annotation](estuary/flow#2383), which enables blocking or hashing portions of documents very early in the capture process. It also adds a new [`flowctl discover` subcommand](estuary/flow#2404), which enables CLI-driven capture creation workflows.

## New Contributors

* @danielnelson made their first contribution in estuary/flow#2403

**Full Changelog**: estuary/flow@v0.5.21...v0.5.22
Description:
This PR implements enforcement of our emergent "Prefix" concept - a jointly verified tuple of (catalog-prefix, storage-buckets, admissible-data-planes). The validation layer now enforces that data planes must be drawn from the data_planes field of the best-covering storage mapping, with additional consistency checks to ensure storage mapping alignment.
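As a rough illustration of the rule above, the sketch below assumes that "best-covering" means the storage mapping with the longest catalog prefix matching the task name. That interpretation is an assumption, as are the names `StorageMapping`, `best_covering`, and `check_data_plane`, none of which are taken from the PR.

```rust
/// Hypothetical stand-in for a storage mapping row.
pub struct StorageMapping {
    pub catalog_prefix: String,
    pub data_planes: Vec<String>,
}

/// Assumed meaning of "best-covering": among mappings whose prefix covers
/// the task name, take the one with the longest prefix.
pub fn best_covering<'m>(
    mappings: &'m [StorageMapping],
    task_name: &str,
) -> Option<&'m StorageMapping> {
    mappings
        .iter()
        .filter(|m| task_name.starts_with(m.catalog_prefix.as_str()))
        .max_by_key(|m| m.catalog_prefix.len())
}

/// Reject a data plane that the best-covering mapping does not list.
pub fn check_data_plane(
    mappings: &[StorageMapping],
    task_name: &str,
    data_plane: &str,
) -> Result<(), String> {
    let mapping = best_covering(mappings, task_name)
        .ok_or_else(|| format!("no storage mapping covers {}", task_name))?;

    if mapping.data_planes.iter().any(|p| p.as_str() == data_plane) {
        Ok(())
    } else {
        Err(format!(
            "data plane {} is not in the data_planes of storage mapping {}",
            data_plane, mapping.catalog_prefix
        ))
    }
}
```

With mappings for `acmeCo/` and `acmeCo/prod/`, a task named `acmeCo/prod/source-pg` is checked against the second mapping's list only.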
Changes:
- `flowctl discover` command submits discovery jobs to the control plane for CLI-driven discovery

See individual commits for more detail.
Workflow steps:
- `flowctl discover` submits discovery jobs to the control plane's discovers table and fetches the resultant draft for local development
- `flowctl` no longer sends a data plane parameter unless explicitly specified by the user via `--init-data-plane`, which is renamed from `--default-data-plane`.

Example of `flowctl discover`: