Conversation
Adds a `PartialEq` implementation for all models. This sets the stage for future work in publications and validation, which needs to determine whether models have changed.
…ames Updates `tables::DraftCatalog::all_catalog_names` to include materialization `sourceCapture`s. This is required in order to correctly track spec dependencies during validation, which will be introduced in a subsequent commit. It also makes things consistent, as now a `sourceCapture` is always considered a dependency of a materialization, whereas before it depended on the situation.
8df5d9d to
22d0f96
Compare
Lays some groundwork for "touch" publications and computing dependency hashes. These changes need to be paired together, becuase the use of touch publications is incompatible with our current approach to dependency change detection. **Touch operations:** Are defined as publications where the draft row sets `is_touch: true` and has a model that is identical to the live model. A touch does not update the `last_pub_id`, though the agent changes for implementing that will come in a subsequent commit. The intent is to use a touch in order to update the `built_spec` in response to changes in dependencies. **Dependency hash:** A dependency hash is an opaque value that is computed from the names and `last_pub_id`s of every spec that a given spec depends on. The concept of dependencies was introduced along with controllers, but the implementation of change detection was not fully correct. Using a hash guarantees (% the infinitesimal chance of hash collision) correctnes of change detection. This introduces the computation of the hash as part of validation. **Other changes:** Threads through `last_build_id`, and lays some of the groundwork (in `validation::walk_transition`) for using `last_build_id` instead of `last_pub_id` for optimistic locking. Rename `BuiltSpec::is_unchanged` to `is_passthrough`, since the meaning of `is_unchanged` became unclear with the introduction of touch operations, which explicitly don't change the model.
e7ebcf0 to
fa9f9b0
Compare
Updates the agent to support `is_touch` in publications and the `dependency_hash` in publications and controllers. For specs published with `is_touch=true`, the agent will _not_ add new `publication_specs`, update the `last_pub_id`, or update the `spec` (the drafted spec must be identical). The dependency hash column is added, and persisted as part of committing a publication. Controllers compute the current hash value whenever they run, and compare it to the persisted value in order to determine whether any dependencies have changed. When controllers detect a change in dependencies, their behavior generally depends on whether any dependencies have been deleted or not. As long as no dependencies were deleted, then controllers will publish their spec using `is_touch = true`. If any dependencies have been deleted, then task controllers will use a normal publication and update the model to remove the deleted dependencies, which matches the previous behavior. Controllers will attempt to collapse multiple subsequent touch publications into their publication histories in an attempt to limit the noise of many publications in response to dependency changes. The `ActivationStatus` has been updated to use `last_build_id` instead of `last_pub_id` as the high water mark for activations. The previous behavior was never fully correct, but `last_build_id` must increase with every modification of the `built_spec`, so this new behavior is guaranteed not to miss activating any changes. Going along with the change to `ActivationStatus`, the `CONTROLLER_VERSION` constant was incremented in order to prevent old agent versions from clobbering changes to the activation status during agent rollout.
fa9f9b0 to
e6f1a59
Compare
Updates controllers to set the `detail` for touch publications, to make debugging easier. Also cleans up a few comments, and skip serializing `is_touch` in publication histories when it's `false`.
3f7eabd to
9c94e72
Compare
jgraettinger
left a comment
There was a problem hiding this comment.
LGTM! This looks great. I have quibbles below about whether dependency_hash is a String or a u64, and its Option-ality, but this looks correct & solid to me.
| // Built specification of this capture as-of `last_pub_id`. | ||
| val spec: proto_flow::flow::CaptureSpec, | ||
| // Hash of the last_pub_ids of all the dependencies that were used to build the capture | ||
| val dependency_hash: Option<String>, |
There was a problem hiding this comment.
What about a u64 digest instead of String?
Also, why is this Option? Under the control plane data-model, it seems a dependency digest is a property of every live spec that must exist.
(Yes, there's probably a migration involved, but initializing with a digest of String("") or u64(0) would work, right? As in it might update once due to "mismatch" and then be good thereafter?
There was a problem hiding this comment.
My thinking for using a string instead of a u64 is that it's more obviously something that should be treated as opaque.
The Option just seemed convenient as a way to represent a spec that has no dependencies. For example, a collection, or a capture with no bindings.
I don't feel especially strongly about either of those things, and could get behind a u64, where a 0 means it has no dependencies. It's convenient to have a simple constant representing "no dependencies", though, because for example when we validate collection specs, we know that they can't have dependencies and so don't consult the Dependencies::compute_hash function at all. That sound good to you?
There was a problem hiding this comment.
The Option just seemed convenient as a way to represent a spec that has no dependencies. For example, a collection, or a capture with no bindings.
I suppose I don't see "zero bindings" as different from "one or more bindings". It's all the same continuum, an empty set of bindings can be represented in the same hash space as non-empty ones, and thus why jump through the extra Some-check hoops to model the distinction. Or, if it remains a string, an empty string could also serve as the current None.
I've said my piece, though, and it's so minor I'm okay any way you want to roll, including merging as-as.
There was a problem hiding this comment.
That's a fair point. I'm inclined to just merge as-is, though, as it doesn't seem like we'll actually regret the current representation.
| // Expected last build ID for optimistic concurrency. | ||
| val expect_build_id: models::Id, | ||
| // Hash of the last_pub_ids of all the dependencies that were used to build the capture | ||
| val dependency_hash: Option<String>, |
There was a problem hiding this comment.
Same as above, this doesn't seem like it should be Option.
I suppose an argument would be when representing deleted specs (spec is None). Still, I think I'd expect the value to be digest([ /* empty slice of (name, last_pub_id) */])
| anyhow::bail!("missing spec for expanded row: {:?}", exp.catalog_name); | ||
| }; | ||
| let scope = tables::synthetic_scope(spec_type, &exp.catalog_name); | ||
| // TODO(phil): currently we set `is_touch: false`, which is consistent with the prior |
There was a problem hiding this comment.
agreed, i believe these should be touches too
Use consistent field ordering in `tables` and make `is_touch` part of the return value of `into_parts`.
03d9928 to
b26c774
Compare
Description:
Introduce "touch" publications, and use them to update specs in response to changes in their dependencies.
This implements a portion of the work that's required for #1520. The remainder of that work will come in subsequent PRs.
A "touch" is a new way to publish a spec in cases where the model itself is unchanged, and only the
built_specneeds to be updated. Unlike a normal publication, a touch does not modify thelast_pub_idor thespec, and nopublication_specsare created. This means that touches do not appear in the publication history in the UI. Only thebuilt_specandlast_build_idare modified, in order to embed thebuilt_specs of dependencies.Controllers are updated to use a touch operation when responding to updates of their dependencies. For example, a materialization controller will use a touch to update the materialization
built_specin response to a change thelast_pub_idof any of it's source collections orsourceCapture.In order to determine whether dependencies have changed, a new
dependency_hashcolumn has been added. This is needed because the mechanism we used previously could miss updates to dependencies in some cases. The hash is computed during validation by hashing the(catalog_name, last_pub_id)of each dependency of the spec. Note that this is based onlast_pub_idand notlast_build_id. This means that a touch of a dependency (e.g. collection) does not change thedependency_hashof any dependent specs (e.g. a materialization).1Another benefit of touch publications to reduce the noise in the publication history. We'll no longer clog up the
_/details/historyUI with a bunch ofin response to changes in one or more dependenciespublications. The controller status does still record touch publications, though, as they can be important for debugging various scenarios. In order to cut down on the noise in the status history, consecutive touch publications are now folded into a single entry in the history as long as they have the same result. Thecountfield tracks the number of publications in each entry, and thedetailandcompletedfields always reflect that of the most recent publication.Example materialization controller status showing
is_touchpublications being collapsed in the history.{ "type": "Materialization", "source_capture": { "up_to_date": true }, "publications": { "max_observed_pub_id": "0ead0b8e1a0003dc", "history": [ { "id": "0ead0a8e0c0003dc", "created": "2024-09-11T14:36:24.472050767Z", "completed": "2024-09-11T14:38:36.787707336Z", "detail": "", "result": { "type": "success" }, "is_touch": true, "count": 6 }, { "id": "0ead082a130003dc", "created": "2024-09-11T14:31:11.142596754Z", "completed": "2024-09-11T14:31:12.461721809Z", "detail": "adding binding(s) to match the sourceCapture: [phil/a/wd-2]", "result": { "type": "success" }, "is_touch": false }, { "id": "0ead04d71b0003dc", "created": "2024-09-11T14:23:55.446274518Z", "completed": "2024-09-11T14:30:56.615646164Z", "detail": "", "result": { "type": "success" }, "is_touch": true, "count": 7 } ] }, "activation": { "last_activated": "0ead0b53890003dc" } }Workflow steps:
I've been testing using a webhook capture and a shell script ingest documents with new properties in a loop. This generates synthetic load by causing the inferred schema to be updated repeatedly. While that's running, I've been able to publish the materialization via the UI, which previously would have failed due to the
last_pub_idbeing updated. Publishing the Capture via the UI will still fail with anExpectPubIdNotMatchederror because thelast_pub_idof the Collection is still updated whenever the inferred schema is updated. A future PR will enable us to stop settingexpect_pub_idfor collections, which should mitigate these errors.I tested the upgrade path locally, and didn't find any issues with the normal process. Just apply the migration and then start deploying the new agents. The new agent processes will set
controller_version = 2, which will prevent the old agents from clobbering their changes.Documentation links affected: n/a
Notes for reviewers:
This change is
Footnotes
An early apparent complication to this was storage mappings. There's no explicit dependency between live specs and their storage mappings, and we always use the latest mapping whenever we publish a given collection, and so we might worry that storage mappings could be updated and fail to propagate through to the
built_specs of dependent captures and materializations. This turned out to be a non-issue because we create a new (regular, not touch) publication of all specs whenever a storage mapping is updated, which increases thelast_pub_idof all affected collections. Thus all dependent tasks will respond with a touch that updates their ownbuilt_specto include the updated storage mappings. ↩