Skip to content

Phil/pub conflicts deux#1662

Merged
psFried merged 5 commits intomasterfrom
phil/pub-conflicts-deux
Oct 4, 2024
Merged

Phil/pub conflicts deux#1662
psFried merged 5 commits intomasterfrom
phil/pub-conflicts-deux

Conversation

@psFried
Copy link
Copy Markdown
Contributor

@psFried psFried commented Sep 26, 2024

Description:

Rolls up some refactors of the publish api and the remainder of the work for #1520.

This change should not really be visible to end users.
The one caveat is that now the publications.id column will no longer match up with live_specs.last_pub_id or publication_specs.pub_id, since the true publication id is now determined by the publisher. I added the publications.pub_id column in order to allow joining publications to those tables, since that's still useful for debugging.

Note that the publication_specs_ext view is unaffected by that change, since it does not join to publications. Thus flowctl catalog history still works without any changes.


This change is Reviewable

@psFried psFried force-pushed the phil/pub-conflicts-deux branch 2 times, most recently from 30cce7e to c59397b Compare September 27, 2024 16:21
@psFried psFried marked this pull request as draft September 27, 2024 17:04
@psFried
Copy link
Copy Markdown
Contributor Author

psFried commented Sep 27, 2024

I moved this back to a draft. At this point, I think it'd be unwise to merge/deploy it until after I'm back from vacation, so I'm thinking I might as well fold in the inferred schema changes to finish out the rest of #1520 .

Copy link
Copy Markdown
Member

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great so far! Really like the DraftPublication refactor.

.await?;
let mut expanded_names = Vec::with_capacity(expanded_rows.len());
for exp in expanded_rows {
if self.filter_user_has_admin
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be pushed down into fetch_expanded_live_specs ? I'm betting that routine can be simplified a bunch given the refined / simpler authorization expectations noted above ☝️

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it doesn't really seem worth it at this stage. Doing it in the db just saves us fetching specs that we'd just filter out. But at present there's only a handful of specs in the whole system that would actually be filtered out by this. Maybe worthwhile once we get more people sharing collections, though.

Passing an explicit publication id was important for our previous
approach to handling dependencies, but is no longer needed now that we
have touch publications. With a touch, the publication id is irrelevant
since the `last_pub_id` of the touched sped will not change.

Removing it from the `publish` function arguments allows the actual id
to be generated by the publisher, and sets us up for automatically
re-trying `PublicationSuperseded` errors, which requires generating a
new publication id.
@psFried psFried force-pushed the phil/pub-conflicts-deux branch from c59397b to 4e6f5ea Compare October 4, 2024 14:38
Significantly refactors the `Publisher`, in order to address a number of
outstanding issues. Introduces a `DraftPublication` struct that
represents the desire to publish a draft, along with associated metadata
and configuration. Also added the `Publisher::publish(&self, DraftPublication)`
function, which handles build, commit, and retries.

Fixes #1634 by bypassing authorization checks for controller-initiated
publications. The `verify_user_authz` field of `DraftPublication` can be
used to toggle this behavior.

Takes care of a portion of #1520 by generating a new publication id for
each attempted publication, rather than re-using the `publications.id`
column value. This required adding a new `publications.pub_id` column
to hold the effective publication id, since `publications.id` can no
longer be used to join to `publication_specs` or
`live_specs.last_pub_id`.
Updates the default publication retry policy to retry `BuildSuperseded`
and `PublicationSuperseded` errors. This addresses a portion of #1520
Updates the agent to always update collection inferred schemas on any
non-touch collection publications. This makes it impossible for a user
to accidentally clobber an embedded inferred schema definition.

Also removes inferred schema handling from the `validation` crate, since
the drafted specs will already contain the updated schemas. Now all
validation needs to do is to bundle the write schema if necessary.
Resolves #1520

Updates the agent discovers handler to no longer set `expect_pub_id` for
collection specs. The goal is to avoid conflicts for collections that
use inferred schemas. There's no longer a risk of clobbering inferred
schemas due to the previous commit, which ensures published collections
always use the latest inferred schema. So it seems relatively safe to
omit `expect_pub_id` for the collection specs, and doing so ensures that
publications cannot fail simply due to a concurrent update of the
inferred schema.
@psFried psFried force-pushed the phil/pub-conflicts-deux branch from cc269c4 to 970dfcf Compare October 4, 2024 15:30
@psFried psFried marked this pull request as ready for review October 4, 2024 16:53
Copy link
Copy Markdown
Member

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@psFried psFried merged commit abac9c1 into master Oct 4, 2024
@psFried psFried deleted the phil/pub-conflicts-deux branch October 4, 2024 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

agent: controller publications should bypass authorization checks

2 participants