Skip to content

Ranking: standardize ctags kind names before scoring#674

Merged
jtibshirani merged 3 commits into
mainfrom
jtibs/score-kind
Oct 27, 2023
Merged

Ranking: standardize ctags kind names before scoring#674
jtibshirani merged 3 commits into
mainfrom
jtibs/score-kind

Conversation

@jtibshirani

@jtibshirani jtibshirani commented Oct 25, 2023

Copy link
Copy Markdown
Contributor

SCIP ctags can output different kind names than universal-ctags (for example
typeAlias instead of talias). This change makes sure we handle different
names for the same kind. To do so, it refactors the logic so we first match
strings to standard kinds, then decide how these are scored for each language.
That way, you don't need to remember to cover all the possible kind names each
time you adjust scoring for a new language.

Also added basic tests for Ruby and Python to ensure we don't accidentally
change the scoring.

Relates to https://github.com/sourcegraph/sourcegraph/issues/57659

Comment thread ctags/symbol_kind.go
type SymbolKind uint8

const (
Accessor SymbolKind = iota

@jtibshirani jtibshirani Oct 25, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on this PR got me thinking ... it'd be nice to just use SCIP as the exchange format instead of ctags output, and have a tool to map universal-ctags onto SCIP. That would get us closer to an actual spec, unlike the ctags output which has inconsistent naming and an unknown universe of values.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with this. I think this was discussed when scip-ctags was added and that was seen as an part of the end goal.

Comment thread build/e2e_test.go
t.Fatal(err)
}

examplePython, err := os.ReadFile("./testdata/example.py")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow-up, I'll try to pull in SCIP ctags so we can run the same tests using that binary.

@jtibshirani jtibshirani marked this pull request as ready for review October 25, 2023 21:38

@keegancsmith keegancsmith left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Comment thread ctags/symbol_kind.go
type SymbolKind uint8

const (
Accessor SymbolKind = iota

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with this. I think this was discussed when scip-ctags was added and that was seen as an part of the end goal.

@jtibshirani jtibshirani merged commit c23ed05 into main Oct 27, 2023
@jtibshirani jtibshirani deleted the jtibs/score-kind branch October 27, 2023 15:37
keegancsmith pushed a commit that referenced this pull request Nov 1, 2023
SCIP ctags can output different kind names than universal-ctags (for example
`typeAlias` instead of `talias`). This change makes sure we handle different
names for the same kind. To do so, it refactors the logic so we first match
strings to standard kinds, then decide how these are scored for each language.
That way, you don't need to remember to cover all the possible kind names each
time you adjust scoring for a new language.

Also added basic tests for Ruby and Python to ensure we don't accidentally
change the scoring.
jtibshirani added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Jul 31, 2024
To implement `select:symbol.enum` filters, we look at each symbol's
ctags 'kind' and check if it matches the filter value `enum`. We
accidentally didn't include 'enum' in this match logic, so all these
symbols were filtered away.

This PR fixes that, and adds a few improvements:
* Use a shared map between `symbol.LSPKind` and `symbol.SelectKind`, to
avoid drift between these two conversions.
* Audit the ctags mapping from
[sourcegraph/zoekt#674](sourcegraph/zoekt#674)
and add other missing kinds (besides enum)

Closes SPLF-178
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants