perf(parser): split TriviaBuilder::handle_token hot/cold paths #22415
Merged
Merging this PR will not alter performance
## Summary

`TriviaBuilder::handle_token` is called by `Lexer::finish_token` once per token. The original body mixed two paths:

- **Hot path**: three field writes (`previous_kind`, `saw_newline`, `saw_newline_for_comment`).
- **Cold path**: a for-loop that retroactively marks any pending comments as leading comments of the current token. It only fires when there are unprocessed comments before a token, which is rare on a per-token basis.

LLVM wasn't inlining the function; it appeared as a separate symbol at 1.46% parse self-time in xctrace Time Profiler. Splitting the for-loop into a `#[cold]` helper and marking the wrapper `#[inline]` lets the hot path fuse into `finish_token`'s body and keeps the cold attachment loop out of the inlined code.

No behavior change: the same fields are written and the same comments attached in the same order. test262 conformance unchanged: 47086/47086 positive, 4588/4588 negative.

## Measurements

Wall time on a parse-only loop (parse N times, reusing the allocator), `--profile release-with-debug`, median of 3 runs:

| file | before | after | delta |
|---|---|---|---|
| cal.com.tsx (1.0M) | 3.97 ms/iter | 3.80 ms/iter | **-4.3%** |
| checker.ts (2.8M) | 7.69 ms/iter | 7.56 ms/iter | -1.7% |
| antd.js (6.7M) | 15.46 ms/iter | 15.23 ms/iter | -1.5% |
| pdf.mjs (554K) | 2.27 ms/iter | 2.27 ms/iter | noise |
| binder.ts (189K) | 418 µs/iter | 420 µs/iter | noise |

Modest but consistent across the larger inputs. Best on TSX (JSX brings more tokens per byte of source, so more `handle_token` calls).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
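For readers unfamiliar with the pattern, the hot/cold split described in the Summary can be sketched roughly as follows. This is a minimal illustration, not the actual oxc code: the `Kind` and `Comment` types, the `comments`, `processed`, and `leading_of` fields, the `add_comment` and `attach_pending_comments` names, and the values written on the hot path are all assumptions made for the example. Only the three hot-path field names and the `#[inline]` / `#[cold]` attributes come from the description above.

```rust
// Hypothetical, simplified stand-ins for the real parser types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Ident,
    Semicolon,
}

#[derive(Debug)]
struct Comment {
    start: u32,
    // Start offset of the token this comment leads, once known (assumed field).
    leading_of: Option<u32>,
}

#[derive(Debug, Default)]
struct TriviaBuilder {
    comments: Vec<Comment>,
    // Number of comments already attached to a token (assumed bookkeeping).
    processed: usize,
    previous_kind: Option<Kind>,
    saw_newline: bool,
    saw_newline_for_comment: bool,
}

impl TriviaBuilder {
    fn add_comment(&mut self, start: u32) {
        self.comments.push(Comment { start, leading_of: None });
    }

    /// Wrapper: small enough to be inlined into the per-token caller.
    #[inline]
    fn handle_token(&mut self, kind: Kind, token_start: u32) {
        // Rare case: comments are pending, so take the out-of-line path.
        if self.processed < self.comments.len() {
            self.attach_pending_comments(token_start);
        }
        // Hot path: three field writes (the values written here are assumed).
        self.previous_kind = Some(kind);
        self.saw_newline = false;
        self.saw_newline_for_comment = false;
    }

    /// Cold helper: hints that this is unlikely to be called, which keeps
    /// the loop out of the inlined wrapper body.
    #[cold]
    fn attach_pending_comments(&mut self, token_start: u32) {
        for comment in &mut self.comments[self.processed..] {
            comment.leading_of = Some(token_start);
        }
        self.processed = self.comments.len();
    }
}

/// Usage sketch: one comment is pending when the first token arrives.
fn demo() -> TriviaBuilder {
    let mut builder = TriviaBuilder::default();
    builder.add_comment(0);
    builder.handle_token(Kind::Ident, 10); // takes the cold path once
    builder.handle_token(Kind::Semicolon, 11); // hot path only
    builder
}
```

The wrapper keeps only a length comparison and a call in front of the field writes, so inlining it into the caller adds little code, while the loop stays in a separate function that the optimizer treats as unlikely to run.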
overlookmotel added a commit that referenced this pull request May 15, 2026
### 🚀 Features

- bc91a17 codegen: Expose `Codegen::with_source_type` method (#22432) (camc314)

### 🐛 Bug Fixes

- 5ac7e79 minifier: Drop unused-var-init pure IIFEs and preserve annotation for downstream (#22349) (Dunqing)
- 4ab57eb allocator: Fixed-size allocators use `VirtualAlloc` on Windows (#22124) (overlookmotel)
- 66d77eb allocator: Fix segfault on Linux MUSL with fixed-size allocators (#22388) (overlookmotel)
- b8fbc1f transformer/object-rest-spread: Correct scope id when moving bindings (#22419) (camc314)
- 18edc2c codegen: Keep `Object.defineProperty` property name as plain string in minify (#22400) (Dunqing)
- dda33de transformer/explicit-resource-management: Align lexical binding scopes (#22320) (camc314)
- 8e79de8 transformer: Preserve for-await statement bodies (#22361) (camc314)
- 0cba210 transformer/class: Replace `new.target` in static blocks (#22360) (camc314)
- 67ab1c9 transformer/es2018/for-await: Hoist for-await generated bindings (#22355) (camc314)
- c3ceb4a transformer/object-rest-spread: Use hoisted scope for `for-of` temp refs (#22347) (camc314)

### ⚡ Performance

- 73a9043 allocator/bitset: Avoid temp heap `String` allocation (#22403) (camc314)
- 8b2f4f9 transformer/object-rest-spread: Collect `Vec<SymbolId>` over `Vec<BindingIdentifier>` (#22418) (camc314)
- 83679ea parser: Split TriviaBuilder::handle_token hot/cold paths (#22415) (Boshen)
- 2c7d781 codegen: Inline identifier-name accessors (#22411) (Boshen)
- 618bc76 diagnostics: Inline `OxcDiagnosticInner` to avoid heap allocation (#22406) (Boshen)
- 0b4e158 parser: Reserve cap `2` for sequence expressions vec (#22374) (camc314)
- 5f3bdd0 codegen: Add `#[inline]` to `code`, `code_len` (#22373) (camc314)

Co-authored-by: overlookmotel <557937+overlookmotel@users.noreply.github.com>