Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Code monitors: fix a bunch of stuff#57546

Merged
camdencheek merged 7 commits into
mainfrom
cc/monitors-updates
Oct 12, 2023
Merged

Code monitors: fix a bunch of stuff#57546
camdencheek merged 7 commits into
mainfrom
cc/monitors-updates

Conversation

@camdencheek

@camdencheek camdencheek commented Oct 11, 2023

Copy link
Copy Markdown
Member

I apologize for the multi-focused PR, but a few of these fixes bled into each other.

Comments inline.

Test plan

Manually tested that monitors still work. Existing tests cover standard cases, and I updated the test that checks that a failure logs a message.

This removes the feature flag `cc-repo-aware-monitors` that has been
enabled by default for more than a year.
@cla-bot cla-bot Bot added the cla-signed label Oct 11, 2023
Comment on lines -128 to -131
userID, orgID, err := graphqlbackend.UnmarshalNamespaceToIDs(args.Monitor.Namespace)
if err != nil {
return err
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved this up and out of the transaction to make it more obvious what needed to be in the transaction

return err
}

if featureflag.FromContext(ctx).GetBoolOr("cc-repo-aware-monitors", true) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature flag has been enabled by default for over a year

// On subsequent runs, this allows us to treat all new repos or sets of args as something new that should
// be searched from the beginning.
func Snapshot(ctx context.Context, logger log.Logger, db database.DB, query string, monitorID int64) error {
func Snapshot(ctx context.Context, logger log.Logger, db database.DB, query string) (map[api.RepoID][]string, error) {

@camdencheek camdencheek Oct 11, 2023

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the biggest change. Rather than Snapshot saving the results of the snapshot in the database, it returns the set of commitIDs resolved for each searched repo. This makes it possible to run the "UpsertLastSearched" in a transaction even though Snapshot itself can't be run in a transaction itself because search jobs will use the transaction concurrently. It also, conveniently, makes it easier to test.

Comment on lines +77 to +79
if db.Handle().InTransaction() {
return nil, errors.New("Snapshot cannot be run in a transaction")
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than return a super cryptic error when the transaction is used concurrently (e.g. bad connection * conn busy), just fail up front if someone tries to run this in a transaction.

Comment on lines -106 to -109
// HACK(camdencheek): limit the concurrency of the commit search job
// because the db passed into this function might actually be a transaction
// and transactions cannot be used concurrently.
planJob = limitConcurrency(planJob)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we explicitly don't support using a transaction here, it's fine to run the jobs for multiple repos concurrently so we can get rid of this hack.

Comment on lines -238 to -245
func gqlURL(queryName string) (string, error) {
u, err := url.Parse(internalapi.Client.URL)
if err != nil {
return "", err
}
u.Path = "/.internal/graphql"
u.RawQuery = queryName
return u.String(), nil

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was unused

Comment on lines +205 to +209
// NOTE(camdencheek): we want to always save the "last searched" commits
// because if we stream results, the user will get a notification for them
// whether or not there was an error and forcing a re-search will cause the
// user to get repeated notifications for the same commits. This makes code
// monitors look very broken, and should be avoided.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the primary tradeoff. Previously, we would retry searches that failed for whatever reason. This just skips commits that failed, ensuring we always make forward progress and don't notify users multiple times about the same commit. This means it's possible to miss some commits, but I think that's a fair trade for not spamming users and looking super broken, especially since many types of errors are not temporary.

A real solution here would probably look like retrying on error, but since we stream the results upwards, that's a lot more work than I'm willing to put in right now.

it := repos.Iterator(ctx, j.RepoOpts)

p := pool.New().WithContext(ctx).WithMaxGoroutines(j.Concurrency).WithFirstError()
p := pool.New().WithContext(ctx).WithMaxGoroutines(4).WithFirstError()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that concurrency doesn't need to be configurable, just inline the default

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own learning, is there a reason behind the number 4 for the concurrency value?

@camdencheek camdencheek marked this pull request as ready for review October 12, 2023 00:25
@camdencheek camdencheek requested a review from a team October 12, 2023 00:25
@sourcegraph-bot

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff f17dd8b...e2554ef.

Notify File(s)
@jtibshirani internal/search/commit/commit.go
internal/search/job/jobutil/job.go
@keegancsmith internal/search/commit/commit.go
internal/search/job/jobutil/job.go

@camdencheek camdencheek merged commit afd3156 into main Oct 12, 2023
@camdencheek camdencheek deleted the cc/monitors-updates branch October 12, 2023 15:36
@camdencheek camdencheek added backport 5.2 backport/bugfix Standard patches to fix bugs labels Oct 13, 2023
sourcegraph-release-bot pushed a commit that referenced this pull request Oct 13, 2023
- Fixes an issue where creating a monitor will fail due to concurrent transaction usage
- Fixes an issue where a timeout will cause a monitor to continually get re-run
- Cleans up some old feature flags
- Cleans up some dead code

(cherry picked from commit afd3156)
camdencheek added a commit that referenced this pull request Oct 13, 2023
Code monitors: fix a bunch of stuff (#57546)

- Fixes an issue where creating a monitor will fail due to concurrent transaction usage
- Fixes an issue where a timeout will cause a monitor to continually get re-run
- Cleans up some old feature flags
- Cleans up some dead code

(cherry picked from commit afd3156)

Co-authored-by: Camden Cheek <camden@ccheek.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Code monitors: canceled context causes monitor to retry ad nauseum Code monitor error on set up

3 participants