Skip to content

Sync a single module by default, at the root of the repo#2330

Merged
unmultimedio merged 64 commits intomainfrom
jfigueroa/sync-default-behavior
Aug 8, 2023
Merged

Sync a single module by default, at the root of the repo#2330
unmultimedio merged 64 commits intomainfrom
jfigueroa/sync-default-behavior

Conversation

@unmultimedio
Copy link
Member

@unmultimedio unmultimedio commented Aug 1, 2023

This PR sets a much saner default behavior and sync rules when running buf alpha repo sync:

  • Don't force users to pass --module flag each time for all module dirs. If they don't, then assume a single module, at the root of the git repo.
  • Read module identities from the HEAD commit for each branch, and only sync commits in the branch history that match those names.
  • Skip commits with read module failures like:
    • Module not found
    • Module with invalid config
    • Module unnamed
    • Module name different than HEAD
    • Module does not build
  • Improve sync output every time a git commit is synced to a new more readable format

Base automatically changed from jfigueroa/sync-single-branch to main August 1, 2023 17:42
@unmultimedio unmultimedio requested review from a team, bufdev, doriable and seankimdev August 3, 2023 23:03
@unmultimedio unmultimedio marked this pull request as ready for review August 3, 2023 23:03
}

// Module is a module that will be synced by Syncer.
type Module interface {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since --module flag no longer has the format of --module <module-dir>:<remote-module-identity-override>, but just --module <module-dir>, this was not needed anymore.

Comment on lines -202 to +159
// Identity is the identity of the module, accounting for any configured override.
Identity() bufmoduleref.ModuleIdentity
// Bucket is the bucket for the module.
Bucket() storage.ReadBucket
// Commit is the commit that the module is sourced from.
Commit() git.Commit
// Branch is the git branch that this module is sourced from.
Branch() string
// Commit is the commit that the module is sourced from.
Commit() git.Commit
// Tags are the git tags associated with Commit.
Tags() []string
// Directory is the directory relative to the root of the git repository that this module is
// sourced from.
Directory() string
// Identity is the identity of the module.
Identity() bufmoduleref.ModuleIdentity
// Bucket is the bucket for the module.
Bucket() storage.ReadBucket
Copy link
Member Author

@unmultimedio unmultimedio Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorted by visit order:

  • branch > commit(tags) > directory > module (identity, bucket)

and added directory.

return stopLoopErr
}
commitHash := commit.Hash().Hex()
modulesDirsToSyncInThisCommit := make(map[string]bufmodulebuild.BuiltModule)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will hold the built module in here, so we read each module on each commit just once, instead of reading it here and again while syncing.

// On the other hand, if this func returns true for `ReadModuleErrorCodeModuleNotFound`, the
// syncer will stop looking when reaching the commit `u`, will select `v` as the start sync point,
// and the synced commits into the BSR will be [x, y, z].
StopLookback(err *ReadModuleError) bool
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way, if we want in the future, a flag or config could control in which scenarios we stop looking back, and which ones we skip.

Comment on lines -82 to +81
Modules []string
ModulesDirs []string
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could say this is the main trigger for the whole PR. We are no longer receiving module identities overrides, but reading it from HEAD on each branch, for each module directory.

Copy link
Member Author

@unmultimedio unmultimedio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style comments

Copy link
Member Author

@unmultimedio unmultimedio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved types functions to its file, mainly syncer.go and bufsync.go. Renamed a few types, mainly maps to reflect keyToValue, and changed some return []type to []*type.

if syncPoint == nil {
return nil, nil
if s.moduleDefaultBranchGetter == nil {
s.logger.Warn(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refrained myself of doing logger.Sugar().Warn( since the output seemed to be more confusing:

With sugar:

bufl alpha repo sync --timeout 0 --all-branches \
  --module module-a \
  --module module-b \
  --create --create-visibility public

...

WARN	read module from HEAD failed, module won't be synced for this branch{error 26 0  read module in branch newcontent, commit 890b1d857a468e788feea6c34c26b7e2168ed75d, directory module-b: module not found}

WARN	default branch validation skipped{default_git_branch 15 0 main <nil>} {module 15 0 bufbuild.internal/jfigueroa/module-a <nil>} {error 26 0  BSR module does not exist}

...

Without sugar:

bufl alpha repo sync --timeout 0 --all-branches \
  --module module-a \
  --create --create-visibility public

...

WARN	read module from HEAD failed, module won't be synced for this branch	{"error": "read module in branch newcontent, commit 890b1d857a468e788feea6c34c26b7e2168ed75d, directory module-a: module not found"}

WARN	default branch validation skipped	{"default_git_branch": "main", "module": "bufbuild.internal/jfigueroa/module-a", "error": "BSR module does not exist"}

...

syncAllBranches bool

commitsToTags map[string][]string // commits:[]tags
branchesToModulesForSync map[string]map[string]bufmoduleref.ModuleIdentity // branch:moduleDir:moduleIdentityInHEAD
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used the suffix *ForSync instead of *ToSync in many variables in this pkg so it wouldn't be confused with the To that separates keys and values in maps, like in this one.

Copy link
Member

@doriable doriable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I know there are some outstanding discussions on UX, but I believe those should be addressed in a follow-up PR. 🚀 🎉

"Only modules specified via '--module' are synced.",
"Syncing all branches is possible using '--all-branches' flag. " +
"By default a single module at the root of the repository is assumed, " +
"for specific module paths use the '--module' flag. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can do this in a follow-up, but we should provide a more detailed example of how the --module flag is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, although that I think it's already addressed in the --module flag docs. I'll check and add it in upcoming PRs if it's not there.

@unmultimedio unmultimedio merged commit 25da2f0 into main Aug 8, 2023
@unmultimedio unmultimedio deleted the jfigueroa/sync-default-behavior branch August 8, 2023 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants