Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Add Tree-sitter-based syntax-highlighting support for Dart#58480

Merged
varungandhi-src merged 20 commits into
sourcegraph:mainfrom
matthewnitschke:tree-sitter-dart-impl
Apr 15, 2024
Merged

Add Tree-sitter-based syntax-highlighting support for Dart#58480
varungandhi-src merged 20 commits into
sourcegraph:mainfrom
matthewnitschke:tree-sitter-dart-impl

Conversation

@matthewnitschke

@matthewnitschke matthewnitschke commented Nov 21, 2023

Copy link
Copy Markdown
Contributor

Implementation of tree-sitter-dart as discussed in #57880 and #57987

Fixes #57880

Test plan

Added new snapshot tests

@cla-bot

cla-bot Bot commented Nov 21, 2023

Copy link
Copy Markdown

We require contributors to sign our Contributor License Agreement (CLA), and we don't have yours on file. In order for us to review and merge your code, please sign CLA to get yourself added.

Sourcegraph teammates should refer to Accepting contributions for guidance.

@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@cla-bot cla has been signed

@matthewnitschke matthewnitschke marked this pull request as ready for review November 21, 2023 22:50
@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@keynmol Snapshots have been generated, and this is now ready for review!

@varungandhi-src varungandhi-src changed the title Implemented tree-sitter-dart Add Tree-sitter-based syntax-highlighting support for Dart Nov 22, 2023
@varungandhi-src

Copy link
Copy Markdown
Contributor

@matthewnitschke I've re-titled this PR and the linked issue to make it clearer that they're related to syntax highlighting, not search-based code nav for locals (which we also support via Tree-sitter), and not related to implementing the grammar itself (tree-sitter-x is conventionally used for the grammar names).

@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@matthewnitschke I've re-titled this PR and the linked issue to make it clearer that they're related to syntax highlighting, not search-based code nav for locals (which we also support via Tree-sitter), and not related to implementing the grammar itself (tree-sitter-x is conventionally used for the grammar names).

Interesting, I was hoping I'd be able to utilize symbol searching in dart with this update, but that does not sound like the case. Is there a future where this functionality (referring to type:symbol searching) could be provided by scip? If not, could you point me in the location of where this is currently handled within the sourcegraph codebase?

@varungandhi-src

Copy link
Copy Markdown
Contributor

@matthewnitschke yes, type:symbol search can also be handled via SCIP. For that, take a look at the following things:

  • Usages of BundledParser type/cases, including the place with create_tags_configuration
  • scip-tags.scm files (we'd need a similar one for Dart)

I would recommend handling that in a separate PR, since this PR is already quite big.

I think that should be enough, but if we need to add separate Kind cases to SCIP for handling Dart constructs, we may need some extra work in the Zoekt repo to handle those.

Comment on lines +133 to +135
List<String> fruits = ['apple', 'banana', 'cherry'];
// ^^^^ IdentifierType
// ^ IdentifierOperator

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, the < and > here should be marked as PunctuationBracket as they're acting like brackets, not like operators.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be resolved within: UserNobody14/tree-sitter-dart#61

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tree-sitter-dart has been updated to correctly mark these as @punction.bracket, but it appears as though the test framework no-longer annotates these markings

Is this intentional? or is there something wrong within the highlights.scm side of things?

Comment on lines +21 to +23
int age;
// ^^^ IdentifierType
// ^^^ IdentifierBuiltin

@varungandhi-src varungandhi-src Nov 28, 2023

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fields like age should be Identifier not IdentifierBuiltin

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +203 to +207
var numbers = <int>[1, 2, 3, 4, 5];
// ^^^ Keyword
// ^^^^^^^ IdentifierBuiltin
// ^ IdentifierOperator
// ^ IdentifierOperator

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, the < and > here should be treated as punctuation brackets instead of IdentifierOperator

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as https://github.com/sourcegraph/sourcegraph/pull/58480#discussion_r1555004826, doesn't look like @punction.bracket is included anymore

@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@varungandhi-src Oof, looks like the existing tree-sitter-dart implementation is not in the place I was hoping it was, I'll need to spend some time digging into the project

As far as fixes go, would the preferred route be fixing the original tree-sitter-dart, and then updating sourcegraph's fork once its ready? Just looking to address this stuff in the cleanest way possible

@varungandhi-src

Copy link
Copy Markdown
Contributor

As far as fixes go, would the preferred route be fixing the original tree-sitter-dart, and then updating Sourcegraph's fork once its ready?

Yes, that would be the best way to do things.

@cla-bot cla-bot Bot added the cla-signed label Apr 6, 2024
tree-sitter-rust = "0.20.3"
tree-sitter-typescript = "0.20.2"

tree-sitter-dart = { git = "https://github.com/UserNobody14/tree-sitter-dart", rev = "923fbe0d3e62340ea78422551166ce946bc05222" }

@matthewnitschke matthewnitschke Apr 7, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mirrored the changed applied to https://github.com/sourcegraph/tree-sitter-dart, in the upstream: UserNobody14/tree-sitter-dart#67

Unsure if there were other reasons for sourcegraph/tree-sitter-dart, but I think that repo can be archived now

@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@varungandhi-src @keynmol

A bit of a delay on this one... but I've gone ahead and addressed all of varun's original comments in the upstream tree-sitter-dart

Should be ready for another round of reviews

; (hex_floating_point_literal)
] @number

(symbol_literal) @symbol

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you see const MATCHES_TO_SYNTAX_KINDS: &[(&str, SyntaxKind)] in the Rust code, there is no SyntaxKind mapping for symbol, unlike string or constant.builtin. So we should use something else here.

; --------------------
[
(assert_builtin)
(break_statement)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(break_statement)
(break_builtin)

Otherwise we'll end up marking the full range of the statement as a keyword.

Please add test cases for all the non-string keywords at least.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved and added additional test cases for these keyword types

//^^^^^^ IdentifierType
// ^^^^^^ IdentifierType
// ^^^^^^^^^ IdentifierFunction
return Future.delayed(Duration(seconds: 2), () => "Data loaded");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're missing highlighting for seconds - is that deliberate?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@varungandhi-src varungandhi-src left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the minor remaining change for null. I'll do a quick end-to-end test before merging.

@varungandhi-src

Copy link
Copy Markdown
Contributor

@matthewnitschke I've asked internally if we can trigger CI for PRs from forks. If not, I can recreate the PR on a separate branch.

@varungandhi-src

Copy link
Copy Markdown
Contributor

@matthewnitschke can you rebase on top of the latest changes on main? I think the base being outdated might be causing the backcompat tests to fail.

@matthewnitschke

Copy link
Copy Markdown
Contributor Author

@matthewnitschke can you rebase on top of the latest changes on main? I think the base being outdated might be causing the backcompat tests to fail.

@varungandhi-src Merged main and CI's looking green!

@varungandhi-src

varungandhi-src commented Apr 15, 2024

Copy link
Copy Markdown
Contributor

The CI passed, but there is a 'Semgrep OSS' mandatory check that isn't being triggered for some reason, and I can't merge without that. I've asked in our internal Slack channel about it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tree-sitter-based syntax-highlighting support for Dart

2 participants