feat: incomplete datapoints can now resolve the affected repositories by bahrmichael · Pull Request #62756 · sourcegraph/sourcegraph-public-snapshot

bahrmichael · 2024-05-17T09:41:41Z

Closes https://github.com/sourcegraph/sourcegraph/issues/62578

For https://github.com/sourcegraph/sourcegraph/issues/62295

Previously, our GraphQL API gave you no way to find out which repositories caused incomplete datapoints. With this change, you can pass an argument to incompleteDatapoints so that points are not aggregated across repositories, and then resolve the repository for each datapoint.

This PR is needed to help debug incomplete datapoints in Code Insights. When customers create Code Insights for a large number of repositories, it's hard to understand how big the impact of incomplete datapoints is, and which repositories those issues are coming from. If you don't have access to the logs it's basically impossible to isolate problematic repositories.

Queries work as before if you don't add the aggregateRepositories=false parameter and don't resolve the repository.

When you add the aggregateRepositories=false parameter and resolve the repository, you get individual datapoints for each repository that had a problem.

If you set aggregateRepositories=true and attempt to resolve the repository, the repository will be null.
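The cases described above can be modelled with a small Go sketch. This is an illustration of the described behaviour, not the resolver code from this PR: the `record` and `datapoint` types and the `resolve` function are hypothetical, and only the `aggregateRepositories` name comes from the description.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for one stored incomplete point.
type record struct {
	Time   string // timestamp, kept as a string for brevity
	RepoID int
}

// datapoint is a hypothetical stand-in for what the API returns.
// RepoID is nil when points are aggregated across repositories.
type datapoint struct {
	Time   string
	RepoID *int
}

// resolve models the behaviour described in the PR text.
func resolve(records []record, aggregateRepositories bool) []datapoint {
	if !aggregateRepositories {
		// One datapoint per record, each carrying its repository.
		out := make([]datapoint, 0, len(records))
		for _, r := range records {
			id := r.RepoID
			out = append(out, datapoint{Time: r.Time, RepoID: &id})
		}
		return out
	}
	// Aggregated: one datapoint per timestamp, repository left unset.
	seen := map[string]bool{}
	var out []datapoint
	for _, r := range records {
		if !seen[r.Time] {
			seen[r.Time] = true
			out = append(out, datapoint{Time: r.Time})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Time < out[j].Time })
	return out
}

func main() {
	records := []record{{"2024-05-01", 1}, {"2024-05-01", 2}, {"2024-06-01", 1}}
	fmt.Println(len(resolve(records, true)))  // aggregated: one per timestamp
	fmt.Println(len(resolve(records, false))) // one per repository and timestamp
}
```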

Test plan

Existing code paths are covered by CI
I will add more tests if this approach is accepted

bahrmichael · 2024-05-17T11:57:55Z

The license check is a known issue: https://sourcegraph.slack.com/archives/C04MYFW01NV/p1715937672950199

camdencheek · 2024-05-24T13:32:37Z

+        By default, incomplete datapoints are aggregated across all repositories.
+        Setting this to false will allow resolving the repository.
+        """
+        aggregateRepositories: Boolean = true


Q: now that repositories is an array, do we still need this parameter? If a client doesn't care about the repository list, they can just exclude that from the list of fields in their query. Excluding this also removes the (documented but still maybe surprising) dependency between the repositories field and this argument

Yes! Thank you for the reminder. I was able to clean it up, and things seem to work as expected. Since I can't find anything problematic with the store method, it should be good as long as CI passes.
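A rough Go sketch of the shape discussed in this thread, where each datapoint carries a list of repositories instead of needing a separate aggregation argument. The `record` and `datapoint` types and the `groupByTime` function are hypothetical names used for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for one stored incomplete point.
type record struct {
	Time   string // timestamp, kept as a string for brevity
	RepoID int
}

// datapoint is a hypothetical result type: one entry per timestamp,
// with every affected repository attached as a list.
type datapoint struct {
	Time    string
	RepoIDs []int
}

// groupByTime collapses per-repository records into one datapoint
// per timestamp, preserving the order in which timestamps first appear.
func groupByTime(records []record) []datapoint {
	index := map[string]int{} // timestamp -> position in out
	var out []datapoint
	for _, r := range records {
		i, ok := index[r.Time]
		if !ok {
			i = len(out)
			index[r.Time] = i
			out = append(out, datapoint{Time: r.Time})
		}
		out[i].RepoIDs = append(out[i].RepoIDs, r.RepoID)
	}
	for i := range out {
		sort.Ints(out[i].RepoIDs)
	}
	return out
}

func main() {
	records := []record{{"2024-05-01", 2}, {"2024-06-01", 1}, {"2024-05-01", 1}}
	for _, dp := range groupByTime(records) {
		fmt.Println(dp.Time, dp.RepoIDs)
	}
}
```

With this shape, a client that does not care about repositories can simply ignore the list, which is the point raised in the review question.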

camdencheek · 2024-05-24T13:38:38Z

+			if repoId.Valid {
+				mappedRepoIds[i] = int(repoId.Int64)
+			}


Q: the DB schema says repo_id is nullable, but that's kinda surprising to me. Do you understand why that is?

I found https://github.com/sourcegraph/sourcegraph/pull/45282 which inserts null here and mentions global queries. The repoId and repoName should be available though, based on the types that this incomplete insert runs on. I haven't found any places where the repoId and repoName on RecordSeriesPointArgs are not set. Maybe it's to reduce the number of inserts for global queries?

In the backend documentation it sounds like there should also be repo information no matter if it's global or not.

https://github.com/sourcegraph/sourcegraph/blob/d4a6b274037c67e3b76250b2c67f7c80df34da51/doc/dev/background-information/insights/backend.md?plain=1#L144-L145

Good find!

Not blocking, just thinking out loud to try to understand this better. What is a global code insight? It kinda makes sense that a global job wouldn't have a repo ID because it's running against everything, but when would we do that? Maybe there's a special case for an insight that only runs against public repositories, so we know that all users can view all the data, and don't need to keep track of which repo the points are for?

This PR updates the documentation to explain how users can use a new GraphQL field introduced with https://github.com/sourcegraph/sourcegraph/pull/62756 to identify repositories that cause incomplete datapoints.

For https://github.com/sourcegraph/sourcegraph/issues/62295

feat: incomplete datapoints can now resolve the affected repository

abba7d5

cla-bot Bot added the cla-signed label May 17, 2024

bahrmichael requested review from camdencheek and mike-r-mclaughlin May 17, 2024 09:42

bahrmichael added 2 commits May 17, 2024 11:45

fix: run bazel configure

3caad37

fix: bazel test failure

ab01c4f

bahrmichael added 2 commits May 21, 2024 16:40

Merge branch 'main' into bahrmichael/62578

25b1986

Merge branch 'main' into bahrmichael/62578

f23fe84

camdencheek reviewed May 22, 2024


Comment thread cmd/frontend/graphqlbackend/insights.graphql Outdated

camdencheek reviewed May 22, 2024


Comment thread internal/insights/store/store.go Outdated

bahrmichael and others added 3 commits May 24, 2024 12:23

feat: switch from n*m to n datapoints

29369c9

chore: add tests

5b7ecd0

Merge branch 'main' into bahrmichael/62578

98f7f87

camdencheek approved these changes May 24, 2024


bahrmichael and others added 5 commits May 24, 2024 16:14

chore: run bazel configure

1c4682f

chore: remove aggregation parameter

beeeebc

Merge branch 'main' into bahrmichael/62578

3f81e6e

fix: update tests

4d950d6

chore: add changelog entry

17ab32d

bahrmichael enabled auto-merge (squash) May 27, 2024 10:27

bahrmichael changed the title ~~feat: incomplete datapoints can now resolve the affected repository~~ feat: incomplete datapoints can now resolve the affected repositories May 27, 2024

bahrmichael merged commit ff9ef6f into main May 27, 2024

bahrmichael deleted the bahrmichael/62578 branch May 27, 2024 10:32

bahrmichael mentioned this pull request May 28, 2024

feat: explain repository identification for incomplete datapoints sourcegraph/docs#357

Merged

bahrmichael merged 13 commits into main from bahrmichael/62578


2 participants
