feat(batches): use "keyword" as default pattern type#63613
Conversation
|
@eseliger I am not very familiar with the Batch Changes code, so I am looking for some guidance. Do you think this approach makes sense? Especially these 2 points:
Are there any other places in Batch Changes where users can define Sourcegraph queries? |
There was a problem hiding this comment.
This looks good to me (although I am far from a batch changes expert!) It makes sense to use the same default for every user, as batch changes are shared, similar to what we do for notebooks.
You could argue that existing batch changes should use the previous default patterntype standard 🤔 That way, when a user reruns the same batch change after the 5.5 upgrade, it will pull up the same repositories. Curious how important you think it would be to implement that (which would require storing patterntypes in the DB?) versus just keeping it simple with an approach like this. Clarifying: I am not saying "we definitely need to do this", genuinely asking a question!
|
So this means patternType is always overwritten to "keyword"? Or can you still specify the patternType inline? I think Julies concern is valid - this would change the behavior of existing batch specs 🤔 I'm not sure how important that property is, but it should probably at least be well called out somewhere. Can we also make sure the examples in
I think that'd be it. |
You can still specify it inline. We just use "keyword" if the user didn't specify anything.
Oops, I didn't know about those. I will take a look. EDIT: |
| @@ -1,3 +1,4 @@ | |||
| version: v2 | |||
There was a problem hiding this comment.
This example is taken as the default when opening a new batch request. By updating the example we nudge users towards using v2.
| func (wr *workspaceResolver) FindDirectoriesInRepos(ctx context.Context, fileName string, batchSpecVersion string, repos ...*RepoRevision) (map[repoRevKey][]string, error) { | ||
| findForRepoRev := func(repoRev *RepoRevision) ([]string, error) { | ||
| query := fmt.Sprintf(`file:(^|/)%s$ repo:^%s$@%s type:path count:all`, regexp.QuoteMeta(fileName), regexp.QuoteMeta(string(repoRev.Repo.Name)), repoRev.Commit) | ||
| query := fmt.Sprintf(`file:(^|/)%s$ repo:^%s$@%s type:path count:all patterntype:keyword`, regexp.QuoteMeta(fileName), regexp.QuoteMeta(string(repoRev.Repo.Name)), repoRev.Commit) |
There was a problem hiding this comment.
Specifying patterntype:keyword is not really necessary here, but we should always be explicit when specifying queries anyway.
| func determineDefaultPatternType(batchSpecVersion string) searchquery.SearchType { | ||
| switch batchSpecVersion { | ||
| case "v1": | ||
| return searchquery.SearchTypeStandard | ||
| case "v2": | ||
| return searchquery.SearchTypeKeyword | ||
| default: | ||
| return searchquery.SearchTypeStandard | ||
| } | ||
| } | ||
|
|
||
| func (wr *workspaceResolver) runSearch(ctx context.Context, query string, batchSpecVersion string, onMatches func(matches []streamhttp.EventMatch)) (err error) { | ||
| defaultPatternType := determineDefaultPatternType(batchSpecVersion) | ||
| req, err := streamhttp.NewRequestWithVersion(wr.frontendInternalURL, query, searchAPIVersion, &defaultPatternType) |
There was a problem hiding this comment.
This is the core of the change. The default pattern type now depends on the batch spec version.
I updated the PR and introduced a batch spec version. This way, existing batch specs won't change behavior. |
This adds documentation for the new batch spec version. Relates to https://github.com/sourcegraph/sourcegraph/pull/63613
This adds documentation for the new batch spec version. Relates to https://github.com/sourcegraph/sourcegraph/pull/63613
eseliger
left a comment
There was a problem hiding this comment.
Code change LGTM, left a few minor comments.
I'm a bit sad that by default, it's easy to miss that you will need to add version: 2 to get our de-facto standard, but it seems like a lot of extra work to make heuristics based on create date here or so - so let's be pragmatic here.
Needs to update src-cli to support this as well.
We should also add this to the docs to make sure most people will eventually use version: 2.
| } | ||
|
|
||
| // For backward compatibility, we allow the version field to be omitted. | ||
| if spec.Version != "" && !slices.Contains(SupportedBatchSpecVersions, spec.Version) { |
There was a problem hiding this comment.
you will have to update src-cli with the latest version of sg/sg/lib and probably add some logic to the query runner there, too so v2 works in CLI execution mode as well
There was a problem hiding this comment.
@eseliger Is the src-cli change a blocker for this PR? Or can we follow-up with the src-cli change even after the release?
There was a problem hiding this comment.
We usually release those in lockstep, but I don't think it has to be in for 5.5.0, we can cut a 5.5.1 as well. We should just make sure to not let that sit for too long, as batch specs shown in the web will no longer work locally (doesn't understand version and silently ignores it)
This adds documentation for the new batch spec version. Relates to https://github.com/sourcegraph/sourcegraph/pull/63613
Here is the PR for the docs. |
jtibshirani
left a comment
There was a problem hiding this comment.
Having a "schema version" for batch changes seems generally useful and works well for this purpose.
When someone creates a batch change through the UI, could we automatically set version: 2? Otherwise it's really non-obvious to users they should be doing this. And if they don't, there will be the inconsistency we're trying to avoid (defaulting to patterntype: standard).
| in Go code with the equivalent but clearer `strconv.Iota` call. | ||
|
|
||
| # Find all repositories that contain the `fmt.Sprintf` statement using structural search | ||
| # Find all repositories that contain the `fmt.Sprintf` statement using regular expression search |
There was a problem hiding this comment.
Oops, thanks for fixing this after my recent PR :)
| } | ||
| if diff := cmp.Diff(want, have); diff != "" { | ||
| t.Fatalf("returned workspaces wrong. (-want +got):\n%s", diff) | ||
| for _, version := range []int{0, 1, 2} { // Test all versions |
There was a problem hiding this comment.
Tiny comment, it's unusual to be using "version: 0" to mean "no version is set!" Maybe we could explicitly detect the "no version" case early and swap it out for a valid version in the set [1, 2]?
There was a problem hiding this comment.
I agree. The batch spec parser would be the right place it seems. However, I tried it out, and this would store the updated spec in the DB instead of the original, which causes some of the tests to fail. I am not familiar enough with Batch Changes to judge if that would violate a fundamental assumption about batch specs so I reverted the change. I think that would be a nice change though, so I will follow-up once I have time to dig a bit deeper.
That's already the case. When you create a batch change, it takes the "hello world" example as starting point and offers the other examples as alternative. I modified all of them to have the version on top. |
This reverts commit c702aba.
Co-authored-by: Julie Tibshirani <julietibs@apache.org>
Ah, right! You even explained in https://github.com/sourcegraph/sourcegraph/pull/63613#discussion_r1665745471... |
This bumps sourcegraph/sourcegraph/lib to pull in the latest changes from https://github.com/sourcegraph/sourcegraph/pull/63613 to support the new `version` field for batch specs. Notes: - To resolve repos, batches calls the GraphQL endpoint `resolveWorkspacesForBatchSpec`, which takes the serialized spec. This means all the logic to resolve repos is outsourced to Sourcegraph. This is great news for this PR because we don't have to worry how `on.RepositoriesMatchingQuery` is interpreted. The code path in which src-cli itself resolves the workspaces was removed in #819. ### Test plan - New unit tests - Manual testing. I tested the following workflow - `src batch new -f test.spec.yaml` (-> spec contains `version: 2`) - `src batch apply -f test.spec.yaml` (-> validation passes, batch spec shows up in Sourcegraph db) - `src batch validate -f test.spec.yaml` (-> validation passes or fails if I change the version to an unsupported value) - `src batch repos -f test.spec.yaml` (-> checked that returned repos match with expected pattern type) Closes SPLF-126
This adds documentation for the new batch spec version. Relates to https://github.com/sourcegraph/sourcegraph/pull/63613 Closes SPLF-130
This is part of the Keyword GA Project.
Batch Changes uses Sourcegraph queries to define the list of repositories on which the batch change will run.
With this change we default to pattern type "keyword" instead of "standard".
To make this a backward compatible change, we also introduce a version identifier to batch specs. Authors can specify
version: 2in the spec, in which case we default to pattern type "keyword". Existing specs (without a specified version) and specs withversion: 1will keep using pattern type "standard".Notes:
Test plan:
version: 2by default.Changelog
2is introduced. Batch specs specifyingversion: 2will use keyword search as the default pattern type to determine repos/workspaces. Batch specs withversion: 1or without version field keep using pattern type "standard".