perf(linter/plugins): remove regex from getCommentsBefore + getCommentsAfter #20475
Merged
graphite-app[bot] merged 1 commit into main on Mar 21, 2026
Pull request overview
This PR optimizes comment-adjacency queries by reusing the merged tokens+comments buffer, avoiding per-call source-text slicing/whitespace regex checks and enabling a bounds-check-free forward scan via a sentinel entry.
Changes:
- Exported merged-buffer layout constants from `tokens_and_comments.ts` for reuse by other plugins.
- Grew the merged buffer by one entry and wrote a `MERGED_TYPE_TOKEN` sentinel after the last valid entry.
- Reimplemented `getCommentsBefore` / `getCommentsAfter` to binary-search the merged buffer and then walk over consecutive comments using `pos32` arithmetic.
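
To illustrate the sentinel idea, here is a minimal sketch of a forward scan over a merged buffer. It is not the code from this PR: the three-slot entry layout (`start`, `end`, `type`), the constant values, and the helper names `buildMergedBuffer`, `firstEntryAtOrAfter`, and `commentsAfter` are all assumptions made for the example. Only the name `MERGED_TYPE_TOKEN` comes from the summary above.

```typescript
// Hypothetical layout: each entry takes ENTRY_SIZE u32 slots: [start, end, type].
// The values below are assumptions for illustration, not oxlint's real constants.
const ENTRY_SIZE = 3;
const MERGED_TYPE_TOKEN = 0;
const MERGED_TYPE_COMMENT = 1;

interface Entry { start: number; end: number; type: number }

// Pack entries (already sorted by start offset) into a Uint32Array that is
// one entry longer than needed, and write a token-typed sentinel at the end.
function buildMergedBuffer(entries: Entry[]): Uint32Array {
  const buf = new Uint32Array((entries.length + 1) * ENTRY_SIZE);
  let pos32 = 0;
  for (const e of entries) {
    buf[pos32] = e.start;
    buf[pos32 + 1] = e.end;
    buf[pos32 + 2] = e.type;
    pos32 += ENTRY_SIZE;
  }
  buf[pos32 + 2] = MERGED_TYPE_TOKEN; // sentinel: always stops a comment scan
  return buf;
}

// Binary search for the index of the first entry whose start is >= offset.
function firstEntryAtOrAfter(buf: Uint32Array, count: number, offset: number): number {
  let lo = 0;
  let hi = count;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (buf[mid * ENTRY_SIZE] < offset) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Collect the [start, end] ranges of the comments directly after `nodeEnd`.
function commentsAfter(buf: Uint32Array, count: number, nodeEnd: number): Array<[number, number]> {
  const out: Array<[number, number]> = [];
  let pos32 = firstEntryAtOrAfter(buf, count, nodeEnd) * ENTRY_SIZE;
  // No `pos32 < length` check: the sentinel is token-typed, so the loop
  // terminates even when the run of comments reaches the last real entry.
  while (buf[pos32 + 2] === MERGED_TYPE_COMMENT) {
    out.push([buf[pos32], buf[pos32 + 1]]);
    pos32 += ENTRY_SIZE;
  }
  return out;
}

// Example: token, comment, comment, token, comment.
const exampleEntries: Entry[] = [
  { start: 0, end: 1, type: MERGED_TYPE_TOKEN },
  { start: 2, end: 10, type: MERGED_TYPE_COMMENT },
  { start: 11, end: 16, type: MERGED_TYPE_COMMENT },
  { start: 18, end: 19, type: MERGED_TYPE_TOKEN },
  { start: 20, end: 28, type: MERGED_TYPE_COMMENT },
];
const exampleBuf = buildMergedBuffer(exampleEntries);
console.log(commentsAfter(exampleBuf, exampleEntries.length, 1));  // [[2, 10], [11, 16]]
console.log(commentsAfter(exampleBuf, exampleEntries.length, 19)); // [[20, 28]]
```

In this sketch the last call walks off the final real entry and lands on the sentinel, which is what lets the loop skip an explicit bounds check.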
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| apps/oxlint/src-js/plugins/tokens_and_comments.ts | Exports merged-buffer constants and writes a sentinel entry after the merged data to support efficient scans. |
| apps/oxlint/src-js/plugins/comments_methods.ts | Switches before/after comment collection to operate on the merged tokens+comments buffer for improved performance. |
…mentsAfter` (#20475)

Previously, `sourceCode.getCommentsBefore` and `sourceCode.getCommentsAfter` found all comments directly before / directly after a node by finding the comments before / after the node and then checking whether anything other than whitespace lay between them, using a regex search on slices of source text.

Replace this with a more performant algorithm: find consecutive runs of comments before/after the node in the tokens-and-comments buffer. When a token is found, that's the end of the run of comments. No need for regexes and text searches.

This does require generating the tokens-and-comments buffer, which is not so cheap. But it's likely that if a rule calls one of these methods on one node, it'll also call it on many others, so the savings likely outweigh the cost.
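
As a rough illustration of the difference between the two approaches for the "before" direction, the sketch below contrasts a regex-on-slices check with a backward walk over a merged list. It uses plain objects rather than the real buffer, and the `Item` type and both function names are hypothetical, so treat it as a model of the idea rather than the actual implementation.

```typescript
// Hypothetical item shape; the real code works on a packed buffer instead.
interface Item { kind: "token" | "comment"; start: number; end: number }

// Older style: walk comments backwards and regex-test the source text
// between each comment and the current boundary for non-whitespace.
function commentsBeforeRegex(source: string, comments: Item[], nodeStart: number): Item[] {
  const out: Item[] = [];
  let boundary = nodeStart;
  for (let i = comments.length - 1; i >= 0; i--) {
    const c = comments[i];
    if (c.end > boundary) continue; // comment does not end before the boundary
    if (/\S/.test(source.slice(c.end, boundary))) break; // something else is in between
    out.unshift(c);
    boundary = c.start;
  }
  return out;
}

// Newer style: binary-search the merged tokens-and-comments list, then walk
// backwards while entries are comments. The first token ends the run.
function commentsBeforeMerged(merged: Item[], nodeStart: number): Item[] {
  let lo = 0;
  let hi = merged.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (merged[mid].start < nodeStart) lo = mid + 1;
    else hi = mid;
  }
  const out: Item[] = [];
  for (let i = lo - 1; i >= 0 && merged[i].kind === "comment"; i--) {
    out.unshift(merged[i]);
  }
  return out;
}

// Example source with two comments between the statements `x;` and `y;`.
const source = "x; /* a */ // b\ny;";
const comments: Item[] = [
  { kind: "comment", start: 3, end: 10 },  // /* a */
  { kind: "comment", start: 11, end: 15 }, // // b
];
const merged: Item[] = [
  { kind: "token", start: 0, end: 1 },   // x
  { kind: "token", start: 1, end: 2 },   // ;
  ...comments,
  { kind: "token", start: 16, end: 17 }, // y
  { kind: "token", start: 17, end: 18 }, // ;
];

// Both approaches find the same two comments before `y;` (offset 16).
console.log(commentsBeforeRegex(source, comments, 16).length); // 2
console.log(commentsBeforeMerged(merged, 16).length);          // 2
```

The merged-list version never touches the source text. Its cost is building `merged` in the first place, which matches the trade-off described above.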