perf(lexer): fix perf of Token::set_* methods on Rust 1.95.0 #21659

Merged
graphite-app[bot] merged 1 commit into main from om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0
Apr 23, 2026

Conversation

overlookmotel (Member) commented Apr 22, 2026

Token's set_* methods used safe bitwise ops to write "fields" of Token.

Unfortunately this regressed heavily in Rust 1.95.0, due to a bug in LLVM. Methods which were a single scalar op turned into a string of heavy SIMD ops, impacting performance.

Re-implement these methods using unsafe pointer manipulation, which recovers the original assembly, and reverses the perf regression - gains 3% on parser benchmarks.

The original implementations are left in the code as comments, in case we want to revert to them once the LLVM bug is fixed.
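
As a rough illustration of the two approaches described above, here is a minimal sketch. It is not the code from this PR: `PackedToken`, its field layout (bits 32..64 holding `end`), and both method names are hypothetical and chosen only to show the shape of the change.

```rust
/// Hypothetical stand-in for a token packed into one `u128`.
/// Assumed layout (illustrative only): bits 0..32 = start, bits 32..64 = end.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct PackedToken(u128);

const END_SHIFT: u32 = 32;
const END_MASK: u128 = 0xFFFF_FFFF << END_SHIFT;

impl PackedToken {
    /// Reads the assumed `end` field back out of the packed value.
    fn end(self) -> u32 {
        (self.0 >> END_SHIFT) as u32
    }

    /// Safe version: clear the field with a mask, then OR in the new value.
    fn set_end_bitwise(&mut self, end: u32) {
        self.0 = (self.0 & !END_MASK) | (u128::from(end) << END_SHIFT);
    }

    /// Pointer version: store the `u32` directly at the field's byte offset.
    fn set_end_ptr(&mut self, end: u32) {
        // The byte offset of bits 32..64 inside a `u128` depends on endianness.
        const OFFSET: usize = if cfg!(target_endian = "little") { 4 } else { 8 };
        let base = (self as *mut Self).cast::<u8>();
        // SAFETY: `OFFSET + 4 <= 16`, so the write stays inside `self`.
        // `u128` is at least 4-byte aligned and `OFFSET` is a multiple of 4,
        // so the `u32` pointer is aligned. `&mut self` gives exclusive access.
        unsafe { base.add(OFFSET).cast::<u32>().write(end) }
    }
}
```

Both setters should leave the token in the same state; the difference the PR cares about is only in the machine code the compiler emits for each form.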

@overlookmotel overlookmotel marked this pull request as ready for review April 22, 2026 23:20
Copilot AI review requested due to automatic review settings April 22, 2026 23:20
codspeed-hq Bot commented Apr 22, 2026

Merging this PR will improve performance by 3.27%

⚡ 4 improved benchmarks
✅ 44 untouched benchmarks
⏩ 3 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | parser[cal.com.tsx] | 26.5 ms | 25.6 ms | +3.27% |
| Simulation | parser[RadixUIAdoptionSection.jsx] | 83.8 µs | 81.3 µs | +3.05% |
| Simulation | parser[react.development.js] | 1.3 ms | 1.3 ms | +3.24% |
| Simulation | parser[binder.ts] | 3.4 ms | 3.3 ms | +3.02% |

Comparing om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0 (f7a428f) with main (30e0ad3)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses a Rust 1.95.0 performance regression in the lexer by replacing Token’s safe bitfield-update logic with unsafe, direct memory writes to restore the previously tight codegen.

Changes:

  • Reworks Token::set_* methods to write “fields” via pointer-based stores instead of u128 bitwise masking/shifting.
  • Adds compile-time layout/alignment assertions to justify the pointer-based field access.
  • Preserves the previous safe implementations as comments for potential future rollback.
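
The compile-time layout and alignment assertions mentioned in the list above could look roughly like the following sketch. The names `PackedToken`, `END_OFFSET`, and `field_fits` are hypothetical and not taken from the PR; the point is only that `assert!` inside a `const` item turns a violated layout assumption into a build failure.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a token packed into one `u128`.
#[repr(transparent)]
struct PackedToken(u128);

/// Assumed byte offset of a `u32` field inside the token (illustrative only).
const END_OFFSET: usize = if cfg!(target_endian = "little") { 4 } else { 8 };

/// Returns `true` if a field of type `F` at `offset` bytes into `T` is
/// aligned and lies entirely within `T`.
const fn field_fits<T, F>(offset: usize) -> bool {
    align_of::<T>() >= align_of::<F>()
        && offset % align_of::<F>() == 0
        && offset + size_of::<F>() <= size_of::<T>()
}

// Evaluated at compile time: the build fails if either assumption is wrong.
const _: () = assert!(size_of::<PackedToken>() == 16);
const _: () = assert!(field_fits::<PackedToken, u32>(END_OFFSET));
```

Checks like these cost nothing at runtime and document exactly which layout facts the `unsafe` stores depend on.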

Comment thread crates/oxc_parser/src/lexer/token.rs
Comment thread crates/oxc_parser/src/lexer/token.rs Outdated
Comment thread crates/oxc_parser/src/lexer/token.rs Outdated
Comment thread crates/oxc_parser/src/lexer/token.rs Outdated
@overlookmotel overlookmotel marked this pull request as draft April 22, 2026 23:29
@overlookmotel overlookmotel force-pushed the om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0 branch from 9a4e526 to 1068f82 Compare April 22, 2026 23:59
@overlookmotel overlookmotel marked this pull request as ready for review April 23, 2026 00:00
@overlookmotel overlookmotel requested a review from Copilot April 23, 2026 00:01
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

overlookmotel (Member Author) commented

@copilot See response to your critique in #21659 (comment). Do you agree/disagree?

Copilot AI (Contributor) commented Apr 23, 2026

> @copilot See response to your critique in #21659 (comment). Do you agree/disagree?

Agreed — the strict-aliasing/TBAA concern I raised was incorrect. I was incorrectly applying C's type-based alias analysis rules to Rust. Rust's aliasing model is based on borrows (Stacked Borrows / Tree Borrows), not on the declared types of pointers, so reading the same bytes through both u128 and u32/bool pointers is well-defined. The fact that Miri passes confirms the code is sound.
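
To make the aliasing point concrete, here is a small sketch that reads the same bytes both as a `u128` and through a `u32` pointer. The function names are hypothetical and this is not code from the PR; it only shows the kind of access being discussed. A tool such as Miri can be used to check an example like this, though no Miri result is claimed for this snippet.

```rust
/// Reads the low 32 bits of a `u128` through a `u32` pointer.
fn low_u32_via_ptr(value: &u128) -> u32 {
    // The low 32 bits sit at the start of the value on little-endian targets
    // and in the last 4 bytes on big-endian targets.
    const OFFSET: usize = if cfg!(target_endian = "little") { 0 } else { 12 };
    let bytes = (value as *const u128).cast::<u8>();
    // SAFETY: `OFFSET + 4 <= 16`, so the read stays inside `*value`.
    // `OFFSET` is a multiple of 4 and `u128` is at least 4-byte aligned,
    // so the `u32` pointer is aligned. All 16 bytes are initialized.
    unsafe { bytes.add(OFFSET).cast::<u32>().read() }
}

/// Reads the same bits using only integer arithmetic on the `u128`.
fn low_u32_via_cast(value: u128) -> u32 {
    value as u32
}
```

Both functions are expected to return the same value for any input, since the pointee type of a raw pointer does not by itself restrict which memory it may access in Rust.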

@overlookmotel overlookmotel force-pushed the om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0 branch from f1248de to f7a428f Compare April 23, 2026 00:11
@overlookmotel overlookmotel added A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance labels Apr 23, 2026
@camc314 camc314 added the 0-merge Merge with Graphite Merge Queue label Apr 23, 2026
camc314 (Contributor) left a comment

Thanks! I meant to do this following #21523, but didn't manage to.

camc314 (Contributor) commented Apr 23, 2026

Merge activity

`Token`'s `set_*` methods used safe bitwise ops to write "fields" of `Token`.

Unfortunately this regressed heavily in Rust 1.95.0, due to a bug in LLVM. Methods which were a single scalar op turned into a string of heavy SIMD ops, impacting performance.

* #21509
* rust-lang/rust#155422

Re-implement these methods using unsafe pointer manipulation, which recovers the original assembly, and reverses the perf regression - gains 3% on parser benchmarks.

The original implementations are left in the code as comments, in case we want to revert to them once the LLVM bug is fixed.
@graphite-app graphite-app Bot force-pushed the om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0 branch from f7a428f to 2290f31 Compare April 23, 2026 07:28
@graphite-app graphite-app Bot merged commit 2290f31 into main Apr 23, 2026
28 checks passed
@graphite-app graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label Apr 23, 2026
@graphite-app graphite-app Bot deleted the om/04-23-perf_lexer_fix_perf_of_token_set__methods_on_rust_1.95.0 branch April 23, 2026 07:35
camc314 pushed a commit that referenced this pull request Apr 27, 2026
### 💥 BREAKING CHANGES

- 502e804 ast: [**BREAKING**] Reduce size of `TSTypePredicateName`
(#21711) (overlookmotel)
- 5651539 ast: [**BREAKING**] Reduce size of `JSXExpression` (#21710)
(overlookmotel)
- c44e280 ast: [**BREAKING**] Reduce size of `ArrayExpressionElement`
(#21709) (overlookmotel)
- c5b3deb syntax: [**BREAKING**] Remove `CommentNodeId` (#21679)
(overlookmotel)

### 🚀 Features

- b738a39 allocator: Add `Allocator::cursor_ptr` method (#21773)
(overlookmotel)
- 678767e ast: Generate node_id accessors for AST enum wrappers (#21653)
(camc314)
- f091d77 minifier: Inline constant spread elements into arrays (#21095)
(Armano)

### 🐛 Bug Fixes

- 0d608c2 minifier: Preserve raw CR in template literals (#21645)
(Dunqing)
- a889ea9 minifier: Track pure functions in DCE mode (#21722) (Dunqing)
- 674dfac allocator: `Arena` retry allocation when chunk size approaches
maximum (#21777) (overlookmotel)
- f130cc0 allocator: Fix arithmetic overflow in
`Arena::new_chunk_memory_details` (#21745) (overlookmotel)
- b9bf239 allocator: Fix UB in `Arena::grow_zeroed` (#21739)
(overlookmotel)
- d2b9389 allocator: Clippy warning when building without `testing`
feature (#21681) (camc314)
- 503dc86 codegen: Map sourcemaps from visible output starts (#21662)
(Dunqing)
- c92bd3b transformer: Use SPAN for synthesized helper calls to prevent
comment misattribution (#21578) (Dunqing)
- 0d80441 codegen: Add mapping before printing `#` for private ident
(#21619) (camc314)

### ⚡ Performance

- 9fa362e napi/parser: Do not generate tokens except in tests (#21811)
(overlookmotel)
- 0044392 allocator: Reduce branches when allocating new chunk (#21776)
(overlookmotel)
- 7896bd0 allocator: `Allocator::used_bytes` do not use chunk iterator
(#21771) (overlookmotel)
- a5c562f allocator: Remove check in `Arena::new_chunk_memory_details`
(#21750) (overlookmotel)
- 35bbe1f allocator: `Arena` use unchecked size round up where
guaranteed no overflow (#21743) (overlookmotel)
- ffe229b allocator: Remove unnecessary check from
`Arena::try_alloc_layout_slow_impl` (#21732) (overlookmotel)
- 72fece5 allocator: Use `NonNull::offset_from_unsigned` in
`Arena::chunk_capacity` (#21731) (overlookmotel)
- cab32ae ast: Add `#[inline(always)]` to `node_id` methods on enums
with all variants unboxed (#21707) (overlookmotel)
- b179688 parser: Allocate `TriviaBuilder` comments in the arena
(#21512) (Boshen)
- 2290f31 lexer: Fix perf of `Token::set_*` methods on Rust 1.95.0
(#21659) (overlookmotel)
- 1b58029 allocator: Move code into cold path in `Arena::alloc_layout`
(#21622) (overlookmotel)
- 3cf7cef allocator: Reduce instructions on allocation hot path (#21510)
(overlookmotel)

### 📚 Documentation

- ce65070 data_structures: Document why `as_ref` and `as_mut` on
`NonNullConst` and `NonNullMut` take `self` (#21800) (overlookmotel)
- 93b7dbd allocator: Improve doc comments for `ChunkFooter` (#21733)
(overlookmotel)
- 295db8d transformer: Fix comment (#21717) (overlookmotel)
- 5c93af8 ast: Add comments explaining `#[inline(always)]` to `node_id`
methods on enums (#21706) (overlookmotel)
- e4cea25 transform: Use the `node:` namespace in the example (#19998)
(루밀LuMir)

### 🛡️ Security

- d8076c9 deps: Update rolldown (#21639) (renovate)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
4 participants