perf(linter/plugins): reduce allocations for regex tokens #20479

Merged
graphite-app[bot] merged 1 commit into main from om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens
Mar 21, 2026

Conversation

overlookmotel (Member) commented Mar 17, 2026

Optimize storing regex tokens so they can have their `regex` property set back to `undefined` at the end of linting a file. Previously we pushed the `Token`s into an `Array` and then reset the array with `tokensWithRegex.length = 0;` at the end.

This had a problem: When you set an array's length to 0, V8 releases the whole backing allocation, which means that on the next file, it has to allocate again. Worse, that new allocation is made in "new space", and if there's a lot of object creation in rules while linting the file, it will be copied during minor GC, and may even get graduated to "old space". All of that is expensive.

Instead, use a `Uint32Array` to store indexes, for the same reasons as in #20477.
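
A minimal sketch of the pattern described above, not the code in this PR. It assumes tokens live in a per-file array and that a regex token's index is known when it is recorded. The names `tokens`, `regexTokenIndexes`, `activeCount`, `recordRegexToken` and `resetRegexTokens` are hypothetical, as is the `Token` shape.

```typescript
// Hypothetical token shape; the real `Token` class has more fields.
interface Token {
  type: string;
  value: string;
  regex: { pattern: string; flags: string } | undefined;
}

// Per-file token list (stand-in for the linter's cached tokens array).
let tokens: Token[] = [];

// Long-lived buffer of indexes into `tokens`. It is never shrunk, so once it
// has grown large enough it is reused from file to file without reallocating.
let regexTokenIndexes = new Uint32Array(16);
// Number of entries in `regexTokenIndexes` that are valid for the current file.
let activeCount = 0;

function recordRegexToken(index: number, pattern: string, flags: string): void {
  if (activeCount === regexTokenIndexes.length) {
    // Buffer is full: grow by doubling and copy the existing indexes across.
    const grown = new Uint32Array(regexTokenIndexes.length * 2);
    grown.set(regexTokenIndexes);
    regexTokenIndexes = grown;
  }
  regexTokenIndexes[activeCount++] = index;
  tokens[index].regex = { pattern, flags };
}

function resetRegexTokens(): void {
  // Clear `regex` only on the tokens that had it set during this file.
  for (let i = 0; i < activeCount; i++) {
    tokens[regexTokenIndexes[i]].regex = undefined;
  }
  // Reset the counter rather than the buffer, so the allocation survives.
  activeCount = 0;
}
```

Resetting a counter instead of the container is the key difference from `tokensWithRegex.length = 0;`: the buffer itself is untouched, so nothing has to be allocated again for the next file.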

@github-actions github-actions bot added A-linter Area - Linter A-cli Area - CLI A-linter-plugins Area - Linter JS plugins labels Mar 17, 2026
@github-actions github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label Mar 17, 2026
@overlookmotel overlookmotel self-assigned this Mar 18, 2026
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens branch from 66d4271 to 208bee0 Compare March 18, 2026 09:45
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch from 9e16a5a to 9254068 Compare March 18, 2026 09:45
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens branch from 208bee0 to 5374681 Compare March 18, 2026 16:28
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch from 9254068 to 1c54851 Compare March 18, 2026 16:28
@overlookmotel overlookmotel marked this pull request as ready for review March 18, 2026 17:02
@overlookmotel overlookmotel requested a review from camc314 as a code owner March 18, 2026 17:02
Copilot AI review requested due to automatic review settings March 18, 2026 17:02
Copilot AI (Contributor) left a comment

Pull request overview

This PR optimizes regex-token bookkeeping in `apps/oxlint/src-js/plugins/tokens.ts` by replacing an `Array` of token references with a reusable `Uint32Array` of token indexes, aiming to reduce per-file allocations and GC overhead during linting.

Changes:

  • Track regex tokens via `tokensWithRegexIndexes: Uint32Array` + `activeTokensWithRegexCount` instead of `Token[]`.
  • Reuse `regexObjects` entries keyed by `activeTokensWithRegexCount`, and clear `token.regex` during `resetTokens()` using the stored indexes.
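
The two changes listed above might look roughly like the following. This is a sketch under assumptions, not the contents of `tokens.ts`: the identifiers `tokensWithRegexIndexes`, `activeTokensWithRegexCount`, `regexObjects` and `resetTokens` come from the overview, but their bodies, the `setRegex` helper, the function signatures, and the `Token` / `RegexInfo` shapes are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface RegexInfo {
  pattern: string;
  flags: string;
}
interface Token {
  value: string;
  regex: RegexInfo | undefined;
}

// Pool of regex objects, reused across files. Slot `n` serves the n-th regex
// token seen in whichever file is currently being linted.
const regexObjects: RegexInfo[] = [];
// Indexes (into the file's token array) of tokens whose `regex` was set.
let tokensWithRegexIndexes = new Uint32Array(8);
let activeTokensWithRegexCount = 0;

// Hypothetical helper: attach regex info to the token at `tokenIndex`.
function setRegex(tokens: Token[], tokenIndex: number, pattern: string, flags: string): void {
  // Reuse the pooled object for this slot if a previous file created one.
  let regex: RegexInfo | undefined = regexObjects[activeTokensWithRegexCount];
  if (regex === undefined) {
    regex = { pattern: "", flags: "" };
    regexObjects.push(regex);
  }
  regex.pattern = pattern;
  regex.flags = flags;

  if (activeTokensWithRegexCount === tokensWithRegexIndexes.length) {
    // Grow the index buffer by doubling; it is never shrunk afterwards.
    const grown = new Uint32Array(tokensWithRegexIndexes.length * 2);
    grown.set(tokensWithRegexIndexes);
    tokensWithRegexIndexes = grown;
  }
  tokensWithRegexIndexes[activeTokensWithRegexCount++] = tokenIndex;
  tokens[tokenIndex].regex = regex;
}

// Sketch of the reset step: detach regex objects using the stored indexes.
function resetTokens(tokens: Token[]): void {
  for (let i = 0; i < activeTokensWithRegexCount; i++) {
    tokens[tokensWithRegexIndexes[i]].regex = undefined;
  }
  activeTokensWithRegexCount = 0;
}
```

One consequence of pooling in this sketch is that a `regex` object is overwritten when its slot is reused, so it would only be valid while the file it belongs to is being linted.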

@graphite-app graphite-app bot added the 0-merge Merge with Graphite Merge Queue label Mar 20, 2026
@overlookmotel overlookmotel removed the 0-merge Merge with Graphite Merge Queue label Mar 20, 2026
@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label Mar 20, 2026 — with Graphite App
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch from 7c469ec to 3426e69 Compare March 21, 2026 12:21
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens branch from 5374681 to 2090432 Compare March 21, 2026 12:21
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch from 3426e69 to dfd5e3f Compare March 21, 2026 12:32
@overlookmotel overlookmotel force-pushed the om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens branch from 2090432 to 94eb885 Compare March 21, 2026 12:32
graphite-app bot (Contributor) commented Mar 21, 2026

Merge activity

@graphite-app graphite-app bot force-pushed the om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens branch from 94eb885 to 4ee80ac Compare March 21, 2026 12:47
@graphite-app graphite-app bot force-pushed the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch from dfd5e3f to 9c7a267 Compare March 21, 2026 12:47
graphite-app bot pushed a commit that referenced this pull request Mar 21, 2026
… accessed `loc` (#20480)

Similar to #20479. `tokensWithLoc` and `commentsWithLoc` contain tokens/comments on which `loc` property has been accessed.

Previously these arrays were grown with `.push(token)`, and shrunk at the end of linting a file with `tokensWithLoc.length = 0;`. The problem is that setting the length to 0 frees the array's backing allocation, which means it has to be reallocated on the next file when a token's `loc` is accessed.

Instead, never shrink these arrays, and track the active length in separate variables. After warm-up over first batch of files, these arrays will graduate to "old space" and just sit there without further allocations.

Unlike #20479, we don't use `Uint32Array`s, as `loc` is calculated lazily, and `Token`s / `Comment`s don't know what their index is in the `cachedTokens` / `cachedComments` arrays. Adding an `#index` field to `Token` and `Comment` would bloat every instance of these classes by 8 bytes.
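
The "never shrink, track the active length separately" approach from this commit message could be sketched as below. This is an illustration under assumptions rather than the code from #20480: `tokensWithLoc` is named in the message, while `activeTokensWithLocCount`, `trackTokenWithLoc`, `resetLocs` and the `Token` / `Loc` shapes are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface Loc {
  line: number;
  column: number;
}
interface Token {
  start: number;
  loc: Loc | undefined;
}

// Tokens whose `loc` has been computed. The array only ever grows, so its
// backing store is kept from one file to the next.
const tokensWithLoc: Token[] = [];
// Number of leading entries of `tokensWithLoc` that belong to the current file.
let activeTokensWithLocCount = 0;

// Hypothetical hook, called the first time a token's `loc` is computed.
function trackTokenWithLoc(token: Token): void {
  if (activeTokensWithLocCount < tokensWithLoc.length) {
    // Overwrite a stale slot left over from a previous file.
    tokensWithLoc[activeTokensWithLocCount] = token;
  } else {
    // No spare slot yet: grow the array.
    tokensWithLoc.push(token);
  }
  activeTokensWithLocCount++;
}

// Hypothetical end-of-file reset: clear `loc` on tracked tokens only.
function resetLocs(): void {
  for (let i = 0; i < activeTokensWithLocCount; i++) {
    tokensWithLoc[i].loc = undefined;
  }
  // Keep the array's length as-is; only the active count goes back to 0.
  activeTokensWithLocCount = 0;
}
```

In this sketch, stale slots keep referencing tokens from earlier files until they are overwritten. That trade-off is only cheap if the token objects are themselves long-lived or reused, which is assumed here.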
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label Mar 21, 2026
Base automatically changed from om/03-17-perf_linter_plugins_remove_bounds_checks_on_regex_tokens to main March 21, 2026 12:51
@graphite-app graphite-app bot merged commit 9c7a267 into main Mar 21, 2026
24 checks passed
@graphite-app graphite-app bot deleted the om/03-17-perf_linter_plugins_reduce_allocations_for_regex_tokens branch March 21, 2026 12:51
Labels

A-cli Area - CLI A-linter Area - Linter A-linter-plugins Area - Linter JS plugins C-performance Category - Solution not expected to change functional behavior, only performance
