inferred schema updates blocking interactive publications of captures and materializations #1520

@psFried

Description

Say you have a capture and a materialization with many bindings that use inferred schemas. When the control plane publishes updated inferred schemas, it increases the last_pub_id of each affected collection. If a user then goes into the UI and tries to publish the capture or materialization, the collection gets added to the LiveCatalog with a last_pub_id that is potentially greater than the id of the current publication, and the publication fails with an error.
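The failure mode can be illustrated with a minimal sketch. It is not the control plane's actual code: the `LiveSpec` type, the `find_conflicts` helper, the collection names, and the ids are all hypothetical, and it assumes the check simply compares each live spec's last_pub_id against the id of the user's publication.

```python
from dataclasses import dataclass


@dataclass
class LiveSpec:
    """Hypothetical stand-in for a row of live spec state."""
    name: str
    last_pub_id: int


def find_conflicts(pub_id: int, live_specs: list[LiveSpec]) -> list[str]:
    """Return names of live specs whose last_pub_id is newer than pub_id."""
    return [s.name for s in live_specs if s.last_pub_id > pub_id]


# Assumption: the user's publication was assigned id 100 when it was created.
user_pub_id = 100

# Meanwhile an inferred-schema publication with id 101 updated one collection.
live = [
    LiveSpec("acmeCo/orders", last_pub_id=101),
    LiveSpec("acmeCo/customers", last_pub_id=90),
]

conflicts = find_conflicts(user_pub_id, live)
print(conflicts)
```

Here `acmeCo/orders` is reported as a conflict even though the user never edited it, which is the situation described above.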

Current plan (updated after discussion)

  • (Phil/new pub conflicts #1623) Switch to using last_build_id instead of last_pub_id for optimistic locking
  • (Phil/new pub conflicts #1623) Introduce the concept of a "touch" publication, which only updates the built_spec, leaving the spec and last_pub_id unchanged
  • (Phil/new pub conflicts #1623) Controllers will hash the last_pub_ids of all their dependencies in order to determine whether they need to do a touch publication to update their own built_spec.
    • Because a touch publication leaves last_pub_id unchanged, a spec's last_pub_id is no longer incremented merely because its dependencies changed, which avoids the ExpectPubIdNotMatched errors those increments caused.
  • (Phil/pub conflicts deux #1662) Update the publications handler to always inject the current inferred schema into the models of drafted (non-touch) collections.
    • This allows the UI and flowctl to stop setting expectPubId on collection specs. They currently must set it to avoid clobbering inferred schema changes. Once the handler always injects the current inferred schema, clobbering becomes impossible and expectPubId is no longer needed.
  • (Phil/pub conflicts deux #1662) Update the discovers handler to no longer set expect_pub_id for collection specs.
    • It must still set it for the capture specs, though.
    • This avoids the possibility of expectPubId conflicts on the collections.
  • (Phil/pub conflicts deux #1662) Update the publications handler to generate a new pub_id for each publication instead of using the id of the publications row, and to retry PublicationSuperseded errors
    • Generating a fresh pub_id for each attempt is what makes these errors retryable: every retry gets a new (higher) pub_id.
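Two of the ideas in the plan, hashing dependency last_pub_ids to decide on a touch publication and retrying PublicationSuperseded with a fresh pub_id, can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the choice of SHA-256, the `try_publish` callback, and the retry limit are all hypothetical and not taken from the linked PRs.

```python
import hashlib
from itertools import count


def dependency_hash(dep_last_pub_ids: dict[str, int]) -> str:
    """Hash dependency names and last_pub_ids in a stable (sorted) order."""
    h = hashlib.sha256()
    for name in sorted(dep_last_pub_ids):
        h.update(f"{name}={dep_last_pub_ids[name]}\n".encode())
    return h.hexdigest()


def needs_touch(stored_hash: str, dep_last_pub_ids: dict[str, int]) -> bool:
    """A touch publication is needed when the dependency hash has changed."""
    return stored_hash != dependency_hash(dep_last_pub_ids)


class PublicationSuperseded(Exception):
    """Hypothetical error type standing in for a superseded publication."""


def publish_with_retry(try_publish, next_pub_id, max_attempts: int = 3) -> int:
    """Retry a publication, generating a new pub_id for every attempt."""
    for _ in range(max_attempts):
        pub_id = next_pub_id()
        try:
            try_publish(pub_id)
            return pub_id
        except PublicationSuperseded:
            continue  # a newer publication won; try again with a higher id
    raise PublicationSuperseded(f"gave up after {max_attempts} attempts")


# Usage: the stored hash goes stale once a dependency is republished.
stored = dependency_hash({"acmeCo/orders": 100, "acmeCo/customers": 90})
stale = needs_touch(stored, {"acmeCo/orders": 101, "acmeCo/customers": 90})

# Usage: attempts with ids below 102 are superseded; the third succeeds.
ids = count(100)


def fake_publish(pub_id: int) -> None:
    if pub_id < 102:
        raise PublicationSuperseded()


committed = publish_with_retry(fake_publish, lambda: next(ids))
```

Sorting the dependency names before hashing keeps the result independent of iteration order, so the hash changes only when a dependency's last_pub_id actually changes.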
