Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix: Prefer SCIP uploads over LSIF uploads#64217

Merged
varungandhi-src merged 2 commits into
mainfrom
vg/scip-over-lsif
Aug 2, 2024
Merged

fix: Prefer SCIP uploads over LSIF uploads#64217
varungandhi-src merged 2 commits into
mainfrom
vg/scip-over-lsif

Conversation

@varungandhi-src

@varungandhi-src varungandhi-src commented Aug 1, 2024

Copy link
Copy Markdown
Contributor

When determining the set of visible uploads at a given commit,
we group uploads from scip-K and lsif-K for ranking purposes
to allow them to shadow each other. Generally, scip-K will end
up shadowing lsif-K, avoiding sending the client a mix of data
from a much older lsif-K index when newer data from a scip-K
index is available.

Fixes https://linear.app/sourcegraph/issue/GRAPH-787

The logic requires some final tie-breaking to avoid non-determinism
in case when there is a scip-K and lsif-K index at the same distance,
which I've done based on the upload_id in descending order
(upload IDs are monotonically increasing). Earlier, some code was
doing tie-breaking based on upload_id in ascending order,
so I've changed that for consistency. This caused some test failures,
so I've updated the corresponding struct literals.

(I've left some PR review comments to describe each modification
to the existing test cases. The PR is ready for review, I've just added
the comments for ease of review.)

Test plan

Added a new dedicated test checking for shadowing behavior.

Changelog

Fixes a bug where old LSIF uploads would also be used for code
navigation even when newer SCIP uploads were available for the
same language, potentially leading to duplicate results in the reference
panel. With this change, scip-go uploads shadow the uploads
for lsif-go, and similarly for other indexers following the scip-X/lsif-X naming
convention.

When determining the set of visible uploads at a given commit,
we group uploads from scip-K and lsif-K for ranking purposes
to allow them to shadow each other. Generally, scip-K will end
up shadowing lsif-K, avoiding a mix of code intel data
@cla-bot cla-bot Bot added the cla-signed label Aug 1, 2024
@github-actions github-actions Bot added team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform labels Aug 1, 2024
}

const findClosestCompletedUploadsFromGraphFragmentCommitGraphQuery = `
const findClosestCompletedUploadsFromGraphFragmentCommitGraphQuery =

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This language= comment is for GoLand language injections to work properly.

makeCommit(3): {2},
makeCommit(4): {2},
makeCommit(5): {2},
makeCommit(5): {4},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an innocuous change because of the upload_id based tie-breaking.

}

if diff := cmp.Diff([]int{2, 3, 5}, getProtectedUploads(t, db, 50)); diff != "" {
if diff := cmp.Diff([]int{2, 3, 4, 5}, getProtectedUploads(t, db, 50)); diff != "" {

@varungandhi-src varungandhi-src Aug 1, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit iffy -- I will look into why this changed tomorrow.

UPDATE: This change is coming up because there is a tagged commit (5) at which upload ID 4 is now visible. So this is correct.

Comment on lines +725 to +728
api.CommitID(makeCommit(5)): {{UploadID: 2, Distance: 2}},
api.CommitID(makeCommit(6)): {{UploadID: 2, Distance: 3}},
api.CommitID(makeCommit(7)): {{UploadID: 3, Distance: 0}},
api.CommitID(makeCommit(8)): {{UploadID: 1, Distance: 4}},
api.CommitID(makeCommit(8)): {{UploadID: 2, Distance: 4}},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are innocuous and are happening because of the upload_id tie-breaking change.

{commit: makeCommit(3), file: "file.ts", rootMustEnclosePath: true, graph: graph, anyOfIDs: []int{2}},
{commit: makeCommit(4), file: "file.ts", rootMustEnclosePath: true, graph: graph, anyOfIDs: []int{2}},
{commit: makeCommit(6), file: "file.ts", rootMustEnclosePath: true, graph: graph, anyOfIDs: []int{1}},
{commit: makeCommit(6), file: "file.ts", rootMustEnclosePath: true, graph: graph, anyOfIDs: []int{2}},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Innocuous / Happening because of the upload_id tie-breaking change.

makeCommit(3): {2},
makeCommit(4): {2},
makeCommit(5): {2},
makeCommit(5): {4},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Innocuous / Happening because of the upload_id tie-breaking change.

}

if diff := cmp.Diff([]int{2, 3, 5}, getProtectedUploads(t, db, 50)); diff != "" {
if diff := cmp.Diff([]int{2, 3, 4, 5}, getProtectedUploads(t, db, 50)); diff != "" {

@varungandhi-src varungandhi-src Aug 1, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit iffy -- I will look into why this changed tomorrow (maybe it's the same as the other place where 4 is getting added).

UPDATE: This commit graph is the same as in the other test where 4 is getting added, which is why the result is the same here.

Comment on lines +378 to +381
makeCommit(5): {2},
makeCommit(6): {2},
makeCommit(7): {3},
makeCommit(8): {1},
makeCommit(8): {2},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Innocuous / Happening because of the upload_id tie-breaking change.

assertCommitsVisibleFromUploads(t, store, uploads, expectedVisibleUploads)

if diff := cmp.Diff([]int{1}, getUploadsVisibleAtTip(t, db, 50)); diff != "" {
if diff := cmp.Diff([]int{2}, getUploadsVisibleAtTip(t, db, 50)); diff != "" {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Innocuous / Happening because of the upload_id tie-breaking change.

@kritzcreek kritzcreek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Luckily postgres's substr indexing isn't confusing at all /s

Nice change!

@varungandhi-src

Copy link
Copy Markdown
Contributor Author

Luckily postgres's substr indexing isn't confusing at all

What do you mean? By it being 1-based instead of 0-based?

@varungandhi-src varungandhi-src merged commit f9bc3cf into main Aug 2, 2024
@varungandhi-src varungandhi-src deleted the vg/scip-over-lsif branch August 2, 2024 05:36
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants