perf(parser): use peek_token instead of checkpoint/rewind for single-token decisions by Boshen · Pull Request #23056 · oxc-project/oxc

Boshen · 2026-06-07T03:36:33Z

What

Two related changes:

1. perf(parser): use peek_token instead of checkpoint/rewind for single-token decisions. Replace the checkpoint + bump + rewind pattern with a single cached lexer-level peek_token() in four hot statement-start paths where the parse decision depends only on the one token following the keyword:
   - parse_let: token after let
   - parse_import_statement: import.meta / import() vs import declaration
   - parse_async_statement: async function vs other
   - for (let …: token after let
2. fix(parser): dedup irregular whitespaces recorded during lookahead. Prerequisite for the above (see below).

Why

In each path the keyword was speculatively consumed only to inspect the next token, then rewound on the cold path. By peeking first and calling bump only once we commit, the keyword is never speculatively consumed, so no rewind is needed. peek_token() is a cached lexer-level re-lex and is far cheaper than a parser-level checkpoint, which snapshots lexer state, the current token, the error position and the fatal-error slot.

This continues the existing migration of single-token lookaheads from lookahead/checkpoint to peek_token (see the note in modifiers.rs).
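
To make the difference between the two patterns concrete, here is a minimal sketch on a toy lexer and parser. It is not oxc's code: the `Kind`, `Token`, `Lexer`, `Parser` and `Checkpoint` types and the two `is_let_decl_*` functions are hypothetical stand-ins, and the real checkpoint snapshots more state than this one does.

```rust
/// Toy token kinds (hypothetical, not oxc's `Kind`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Let, Ident, LBrack, Other, Eof }

#[derive(Debug, Clone, Copy, PartialEq)]
struct Token { kind: Kind, end: usize }

struct Lexer<'a> {
    src: &'a [u8],
    pos: usize,
    peeked: Option<Token>, // cache filled by `peek_token`
}

impl<'a> Lexer<'a> {
    fn new(src: &'a str) -> Self {
        Self { src: src.as_bytes(), pos: 0, peeked: None }
    }

    /// Lex one token starting at `pos` without touching lexer state.
    fn lex_at(&self, mut pos: usize) -> Token {
        while pos < self.src.len() && self.src[pos] == b' ' { pos += 1; }
        if pos >= self.src.len() { return Token { kind: Kind::Eof, end: pos }; }
        if self.src[pos] == b'[' { return Token { kind: Kind::LBrack, end: pos + 1 }; }
        let start = pos;
        while pos < self.src.len() && self.src[pos].is_ascii_alphanumeric() { pos += 1; }
        if pos == start {
            // Any other byte becomes a one-byte `Other` token so we always advance.
            return Token { kind: Kind::Other, end: pos + 1 };
        }
        let kind = if &self.src[start..pos] == b"let" { Kind::Let } else { Kind::Ident };
        Token { kind, end: pos }
    }

    /// Consume the next token, reusing the peeked one if it is cached.
    fn next_token(&mut self) -> Token {
        let tok = match self.peeked.take() {
            Some(t) => t,
            None => self.lex_at(self.pos),
        };
        self.pos = tok.end;
        tok
    }

    /// Look at the next token without consuming it; the result is cached.
    fn peek_token(&mut self) -> Token {
        if let Some(t) = self.peeked { return t; }
        let t = self.lex_at(self.pos);
        self.peeked = Some(t);
        t
    }
}

/// Parser-level snapshot (the real one also carries error state).
struct Checkpoint { pos: usize, cur: Token }

struct Parser<'a> { lexer: Lexer<'a>, cur: Token }

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        let mut lexer = Lexer::new(src);
        let cur = lexer.next_token();
        Self { lexer, cur }
    }

    fn bump(&mut self) { self.cur = self.lexer.next_token(); }

    fn checkpoint(&self) -> Checkpoint { Checkpoint { pos: self.lexer.pos, cur: self.cur } }

    fn rewind(&mut self, cp: Checkpoint) {
        self.lexer.pos = cp.pos;
        self.lexer.peeked = None;
        self.cur = cp.cur;
    }

    /// Before: consume `let` speculatively, rewind if it was not a declaration.
    fn is_let_decl_checkpoint(&mut self) -> bool {
        let cp = self.checkpoint();
        self.bump(); // speculatively consume `let`
        let is_decl = matches!(self.cur.kind, Kind::Ident | Kind::LBrack);
        if !is_decl { self.rewind(cp); }
        is_decl
    }

    /// After: peek first, consume `let` only once the decision is made.
    fn is_let_decl_peek(&mut self) -> bool {
        let next = self.lexer.peek_token();
        let is_decl = matches!(next.kind, Kind::Ident | Kind::LBrack);
        if is_decl { self.bump(); } // served from the peek cache
        is_decl
    }
}
```

Both functions leave the parser in the same state: on the declaration path `let` has been consumed, and otherwise the current token is still `let`. The peek variant gets there without taking or restoring a snapshot.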

The dedup fix

peek_token/lookahead lex past the current position and record any irregular whitespace they scan into trivia_builder.irregular_whitespaces; after rewinding, the committed path re-lexes the same span and records it again, producing duplicate no-irregular-whitespace diagnostics (e.g. let<NBSP>x = 1).

add_comment already handles this exact rewind-duplicate case for comments by skipping re-inserts on an ordered vec (start <= last.start). The same guard is now applied to add_irregular_whitespace. This also fixes the latent duplicate for the pre-existing lookahead-based paths, and correctly keeps genuinely-distinct adjacent whitespaces (let<NBSP><NBSP>x still reports two).
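
The guard can be sketched as follows. This is an illustration under assumptions, not the actual oxc implementation: the `Span` and `TriviaBuilder` definitions and the `(start, end)` signature are hypothetical, and only the field name, the method name and the `start <= last.start` condition come from the description above.

```rust
/// Hypothetical byte-offset span (not oxc's `Span`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span { start: u32, end: u32 }

/// Hypothetical stand-in for the parser's trivia builder.
#[derive(Default)]
struct TriviaBuilder {
    /// Kept in source order, so the last entry has the highest start offset.
    irregular_whitespaces: Vec<Span>,
}

impl TriviaBuilder {
    /// Record an irregular whitespace unless it was already recorded by an
    /// earlier peek/lookahead pass over the same source range.
    fn add_irregular_whitespace(&mut self, start: u32, end: u32) {
        if let Some(last) = self.irregular_whitespaces.last() {
            // A re-lex after a rewind revisits offsets at or before the last
            // recorded one; those are duplicates and are skipped.
            if start <= last.start {
                return;
            }
        }
        self.irregular_whitespaces.push(Span { start, end });
    }
}
```

For `let<NBSP>x`, the lookahead records the span 3..5 (a no-break space is two bytes in UTF-8) and the committed re-lex offers 3..5 again, which the guard drops. For `let<NBSP><NBSP>x`, the spans 3..5 and 5..7 have increasing starts, so both are kept on the first pass and both are skipped on the re-lex.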

Conformance

No AST change. Verified by panic-isolated, per-suite comparison against main:

- estree (full AST + spans + token streams): byte-identical
- transformer: byte-identical

Irregular-whitespace counts verified directly: let<NBSP>x = 1, for (let<NBSP>x of y){}, async<NBSP>function f(){} each report exactly one span (was two); let<NBSP><NBSP>x reports two.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d903c49f08

codspeed-hq · 2026-06-07T03:43:57Z

Merging this PR will not alter performance

✅ 57 untouched benchmarks
⏩ 9 skipped benchmarks¹

Comparing perf/parser-peek-instead-of-checkpoint (1e0d3d0) with main (37169ff)²

¹ 9 benchmarks were skipped, so the baseline results were used instead.
² No successful run was found on main (dc0e174) during the generation of this report, so 37169ff was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Boshen · 2026-06-07T05:14:27Z

Merge activity

Jun 7, 5:14 AM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
Jun 7, 5:14 AM UTC: Boshen added this pull request to the Graphite merge queue.
Jun 7, 5:21 AM UTC: Merged by the Graphite merge queue.

## What

Follow-up to #23056. Replace the parser-level `lookahead` (checkpoint + bump + rewind) in the TS `asserts` type-predicate path with a single cached lexer-level `peek_token()`. The check only inspects the one token after `asserts`:

```rust
// before
if self.lookahead(|parser| {
    parser.bump(Kind::Asserts);
    parser.is_token_identifier_or_keyword_on_same_line()
}) { ... }

// after
let next = self.lexer.peek_token();
if next.kind().is_identifier_name() && !next.is_on_new_line() { ... }
```

The now-unused `is_token_identifier_or_keyword_on_same_line` helper is removed.

## Why

`peek_token()` is a cached lexer-level re-lex, far cheaper than a parser-level `checkpoint` (which snapshots lexer state, the current token, the error position and the fatal-error slot). Same single-token migration as the four paths in #23056.

## Conformance

No AST change: estree (full AST + spans + token streams) is byte-identical to `main`.

### 💥 BREAKING CHANGES

- ee4dc73 ast: [**BREAKING**] Add `#[non_exhaustive]` to AST nodes (#23046) (overlookmotel)
- 4c35362 ast: [**BREAKING**] Add `AstBuilder::template_element_escape_raw` and `template_element_escape_raw_with_lone_surrogates` methods (#23047) (overlookmotel)

### 🚀 Features

- b846ab2 react_compiler: Integrate the Rust port of the React Compiler (#22942) (Boshen)
- 5b8dd68 parser: Report TS1255 for invalid class definite assertions (#22917) (camc314)
- 85efabf semantic: Make building the class table optional, off by default (#22862) (Boshen)

### 🐛 Bug Fixes

- 556acdc codegen: Parenthesize TS-cast assignment targets (#23112) (Boshen)
- 37169ff codegen: Don't emit space between postfix `--` and `>` when minifying (#23036) (Boshen)
- a4b1bf7 codegen: Drop redundant whitespace in minified TypeScript output (#23038) (Boshen)
- cf53285 parser: Report reserved type-declaration names in the parser (#23035) (Boshen)
- 4e44969 ast: Fix UB in `escape_template_element_raw` (#23052) (overlookmotel)
- c543154 parser: Report comma operator in JSX expression in the parser (#23030) (Boshen)
- 325c94f codegen: Tighten conditional-type and constructor-type whitespace when minifying (#23033) (Boshen)
- 95dd3a2 parser: Report `import type` alias to a non-external reference in the parser (#23032) (Boshen)
- 90180b8 codegen: Drop space after `:` in function return type when minifying (#23028) (Boshen)
- 6da876e parser: Report `abstract` private class field in the parser (#23029) (Boshen)
- 28467ce codegen: Don't emit space before a postfix update operand when minifying (#23027) (Boshen)
- cb29926 codegen: Drop redundant space after `export default` when minifying (#23024) (Boshen)
- 62965ae codegen: Drop redundant space after `else` when minifying (#23025) (Boshen)
- 989230a parser: Report compound assignment to non-simple target in the parser (#23022) (Boshen)
- 06f367c parser: Report `super.#field` private access in the parser (#23014) (Boshen)
- 184edef codegen: Print space before `const`/`declare` enum modifier (#23013) (Boshen)
- 4d722e0 parser: Report duplicate switch `default` clause in the parser (#23012) (Boshen)
- 597ed85 codegen: Parenthesize `let`/`async` for-of head target (#23008) (Boshen)
- 8b631bf codegen: Remove stray space before mapped type value colon (#23010) (Boshen)
- c08407e codegen: Don't over-parenthesize `in` inside an arrow in a for-init (#23009) (Boshen)
- 600cd6f codegen: Parenthesize lower-precedence `TSInstantiationExpression` operand (#23007) (Boshen)
- 187e1a5 codegen: Don't leak space after comment-only JSX expression container (#23006) (Boshen)
- 294c473 codegen: Don't over-parenthesize `TSTypeAssertion` operand (#23004) (Boshen)
- 786d96f codegen: Give `TSTypeAssertion` unary precedence (#23002) (Boshen)
- 1295882 parser: Report `new.target` and `import.meta` syntax errors in the parser (#23003) (Boshen)
- d727b6b codegen: Parenthesize `await` expression as base of `**` (#23001) (Boshen)
- 67dfa08 codegen: Keep parentheses around `new` callees containing a call (#22997) (Boshen)
- 17e7cf3 parser: Disallow unerasable `as`/`satisfies` assertions (#22986) (Boshen)
- beb46d3 parser: Commit to module goal on decorated exports (#22941) (Boshen)
- 49e63f7 isolated-declarations: Require annotations for satisfies initializers (#22898) (camc314)
- 8c93601 isolated-declarations: Allow unknown enum initializer in non-const enum (#22900) (camc314)

### ⚡ Performance

- 7d89909 parser: Peek instead of lookahead for yield disambiguation (#23071) (Boshen)
- bf872f0 parser: Skip arrow lookahead for a parenthesized literal (#23070) (Boshen)
- d19fc54 parser: Guard type-argument speculation behind an angle-token check (#23069) (Boshen)
- 8eb5507 parser: Skip redundant member-rest re-scan on call entry (#23068) (Boshen)
- 883dfc1 parser: Skip parse_call_expression_rest when no call follows (#23063) (Boshen)
- b171153 parser: Peek before the await-using lookahead (#23059) (Boshen)
- 56f21bd parser: Use peek_token for the TS `asserts` type predicate (#23058) (Boshen)
- 68805ac parser: Use peek_token instead of checkpoint/rewind for single-token decisions (#23056) (Boshen)
- 1f9d8eb ast: `AstBuilder::template_element_escape_raw` avoid allocation if no escape required (#23053) (overlookmotel)
- 502b04d semantic: Move cold function redeclaration handling into `#[cold]` function (#22973) (overlookmotel)

### 📚 Documentation

- 275d318 napi/minifier: Point `target` to oxc docs (#23102) (camc314)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>

github-actions Bot added the A-parser Area - Parser label Jun 7, 2026

chatgpt-codex-connector Bot reviewed Jun 7, 2026

Comment thread crates/oxc_parser/src/js/declaration.rs

Boshen force-pushed the perf/parser-peek-instead-of-checkpoint branch from 75ff726 to 1e0d3d0 Compare June 7, 2026 05:06

Boshen added the 0-merge Merge with Graphite Merge Queue label Jun 7, 2026

graphite-app Bot force-pushed the perf/parser-peek-instead-of-checkpoint branch from 1e0d3d0 to 68805ac Compare June 7, 2026 05:18

graphite-app Bot merged commit 68805ac into main Jun 7, 2026
30 checks passed

graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label Jun 7, 2026

graphite-app Bot deleted the perf/parser-peek-instead-of-checkpoint branch June 7, 2026 05:21

Boshen mentioned this pull request Jun 7, 2026

perf(parser): use peek_token for the TS asserts type predicate #23058

Merged

oxc-guard Bot mentioned this pull request Jun 8, 2026

release(crates): oxc v0.135.0 #23117

Merged
