perf(parser): split TriviaBuilder::handle_token hot/cold paths #22415

Merged: graphite-app[bot] merged 1 commit into main from perf/parser-trivia-handle-token-hot-cold on May 14, 2026
Boshen (Member) commented May 14, 2026

Summary

`TriviaBuilder::handle_token` is called by `Lexer::finish_token` once per token. The original body mixed two paths:

  • Hot path: three field writes (`previous_kind`, `saw_newline`, `saw_newline_for_comment`).
  • Cold path: a for-loop that retroactively marks any pending comments as leading comments of the current token. It only runs when unprocessed comments precede the token, which is rare on a per-token basis.

LLVM wasn't inlining the function: it appeared as a separate symbol accounting for 1.46% of parse self-time in the xctrace Time Profiler. Splitting the for-loop into a `#[cold]` helper and marking the wrapper `#[inline]` lets the hot path fuse into `finish_token`'s body and keeps the cold attachment loop out of the inlined code.
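The shape of the split can be sketched as follows. This is a minimal illustration, not the actual oxc code: only the names `TriviaBuilder`, `handle_token`, `previous_kind`, `saw_newline`, and `saw_newline_for_comment` come from the description above. The `Kind` variants, the `Comment` struct, the `processed` index, the helper name `attach_pending_comments`, and the values written to the fields are all assumptions made for the example.

```rust
/// Hypothetical two-variant stand-in for the real token kind enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Ident,
    Semicolon,
}

/// Hypothetical comment record: stores the start offset of the token it leads.
#[derive(Debug, Default)]
pub struct Comment {
    pub attached_to: Option<u32>,
}

#[derive(Debug, Default)]
pub struct TriviaBuilder {
    comments: Vec<Comment>,
    /// Index of the first comment not yet attached (assumed bookkeeping).
    processed: usize,
    previous_kind: Option<Kind>,
    saw_newline: bool,
    saw_newline_for_comment: bool,
}

impl TriviaBuilder {
    /// Record a comment that has not yet been attached to a token.
    pub fn add_comment(&mut self) {
        self.comments.push(Comment::default());
    }

    /// Hot path: one cheap check plus three field writes.
    /// `#[inline]` asks the compiler to fold this into the caller.
    #[inline]
    pub fn handle_token(&mut self, kind: Kind, token_start: u32) {
        if self.processed < self.comments.len() {
            // Rare case: hand off to the out-of-line helper.
            self.attach_pending_comments(token_start);
        }
        self.previous_kind = Some(kind);
        self.saw_newline = false; // assumed reset value
        self.saw_newline_for_comment = false; // assumed reset value
    }

    /// Cold path: `#[cold]` hints that this is rarely called, so the loop
    /// is kept out of the inlined wrapper.
    #[cold]
    fn attach_pending_comments(&mut self, token_start: u32) {
        for comment in &mut self.comments[self.processed..] {
            comment.attached_to = Some(token_start);
        }
        self.processed = self.comments.len();
    }

    pub fn previous_kind(&self) -> Option<Kind> {
        self.previous_kind
    }

    pub fn attachments(&self) -> Vec<Option<u32>> {
        self.comments.iter().map(|c| c.attached_to).collect()
    }
}
```

The wrapper stays small enough to inline because the loop body lives in a separate function, and the only cost added to the common case is a single length comparison.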

No behavior change: the same fields are written and the same comments attached in the same order. test262 conformance unchanged: 47086/47086 positive, 4588/4588 negative.

Measurements

Wall time for a parse-only loop (parse N times, reusing the allocator), built with `--profile release-with-debug`, median of 3 runs:

| file | before | after | delta |
|---|---|---|---|
| cal.com.tsx (1.0M) | 3.97 ms/iter | 3.80 ms/iter | -4.3% |
| checker.ts (2.8M) | 7.69 ms/iter | 7.56 ms/iter | -1.7% |
| antd.js (6.7M) | 15.46 ms/iter | 15.23 ms/iter | -1.5% |
| pdf.mjs (554K) | 2.27 ms/iter | 2.27 ms/iter | noise |
| binder.ts (189K) | 418 µs/iter | 420 µs/iter | noise |

The gains are modest but consistent across the larger inputs; the two smallest files show no change beyond noise. The largest gain is on TSX, where JSX brings more tokens per byte of source and therefore more `handle_token` calls.
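For readers who want to reproduce numbers of this kind, the measurement procedure (time N iterations, take the median of several runs, report the relative change) can be sketched like this. The harness actually used for the PR is not shown in the description, so the function names `time_per_iter`, `median`, `median_of_runs`, and `percent_delta` are hypothetical, and the `work` closure stands in for a single parse call.

```rust
use std::time::{Duration, Instant};

/// Run `work` `iters` times and return the mean wall time per iteration.
pub fn time_per_iter<F: FnMut()>(iters: u32, mut work: F) -> Duration {
    assert!(iters > 0);
    let start = Instant::now();
    for _ in 0..iters {
        work();
    }
    start.elapsed() / iters
}

/// Median of a non-empty sample set (upper middle element for even counts).
pub fn median(mut samples: Vec<Duration>) -> Duration {
    assert!(!samples.is_empty());
    samples.sort();
    samples[samples.len() / 2]
}

/// Repeat the timed loop `runs` times and keep the median per-iteration time.
pub fn median_of_runs<F: FnMut()>(runs: usize, iters: u32, mut work: F) -> Duration {
    let samples = (0..runs).map(|_| time_per_iter(iters, &mut work)).collect();
    median(samples)
}

/// Relative change from `before` to `after`, in percent (negative = faster).
pub fn percent_delta(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

Applied to the table, `percent_delta(3.97, 3.80)` is about -4.28, which rounds to the reported -4.3%. The closure should wrap its result in `std::hint::black_box` if there is a risk of the optimizer discarding the work being timed.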

🤖 Generated with Claude Code

github-actions Bot added the A-parser (Area - Parser) label May 14, 2026

codspeed-hq Bot commented May 14, 2026

Merging this PR will not alter performance

✅ 48 untouched benchmarks
⏩ 3 skipped benchmarks [1]


Comparing perf/parser-trivia-handle-token-hot-cold (03a1071) with main (fb4d98b)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

Boshen added the 0-merge (Merge with Graphite Merge Queue) label May 14, 2026

Boshen commented May 14, 2026

Merge activity

graphite-app Bot force-pushed the perf/parser-trivia-handle-token-hot-cold branch from 03a1071 to 83679ea on May 14, 2026 09:30
graphite-app Bot merged commit 83679ea into main on May 14, 2026
29 checks passed
graphite-app Bot removed the 0-merge (Merge with Graphite Merge Queue) label on May 14, 2026
graphite-app Bot deleted the perf/parser-trivia-handle-token-hot-cold branch on May 14, 2026 09:35
overlookmotel added a commit that referenced this pull request May 15, 2026
### 🚀 Features

- bc91a17 codegen: Expose `Codegen::with_source_type` method (#22432)
(camc314)

### 🐛 Bug Fixes

- 5ac7e79 minifier: Drop unused-var-init pure IIFEs and preserve
annotation for downstream (#22349) (Dunqing)
- 4ab57eb allocator: Fixed-size allocators use `VirtualAlloc` on Windows
(#22124) (overlookmotel)
- 66d77eb allocator: Fix segfault on Linux MUSL with fixed-size
allocators (#22388) (overlookmotel)
- b8fbc1f transformer/object-rest-spread: Correct scope id when moving
bindings (#22419) (camc314)
- 18edc2c codegen: Keep `Object.defineProperty` property name as plain
string in minify (#22400) (Dunqing)
- dda33de transformer/explicit-resource-management: Align lexical
binding scopes (#22320) (camc314)
- 8e79de8 transformer: Preserve for-await statement bodies (#22361)
(camc314)
- 0cba210 transformer/class: Replace `new.target` in static blocks
(#22360) (camc314)
- 67ab1c9 transformer/es2018/for-await: Hoist for-await generated
bindings (#22355) (camc314)
- c3ceb4a transformer/object-rest-spread: Use hoisted scope for `for-of`
temp refs (#22347) (camc314)

### ⚡ Performance

- 73a9043 allocator/bitset: Avoid temp heap `String` allocation (#22403)
(camc314)
- 8b2f4f9 transformer/object-rest-spread: Collect `Vec<SymbolId>` over
`Vec<BindingIdentifier>` (#22418) (camc314)
- 83679ea parser: Split TriviaBuilder::handle_token hot/cold paths
(#22415) (Boshen)
- 2c7d781 codegen: Inline identifier-name accessors (#22411) (Boshen)
- 618bc76 diagnostics: Inline `OxcDiagnosticInner` to avoid heap
allocation (#22406) (Boshen)
- 0b4e158 parser: Reserve cap `2` for sequence expressions vec (#22374)
(camc314)
- 5f3bdd0 codegen: Add `#[inline]` to `code`, `code_len` (#22373)
(camc314)

Co-authored-by: overlookmotel <557937+overlookmotel@users.noreply.github.com>