fix(codegen): preserve verbatim text of pure/no-side-effects comments #22525
Merging this PR will not alter performance
9b1859f to 82bb185
@copilot resolve the merge conflicts in this pull request
…#22525)

## Summary

- Stash `@__PURE__` / `@__NO_SIDE_EFFECTS__` (and `#`-prefixed) comments by `attached_to` during comment build, then emit the original source text at call / new / function-decl sites. Falls back to the canonical literal (`/* @__PURE__ */`, `/* @__NO_SIDE_EFFECTS__ */`) only when source text is unavailable or the stashed comment's kind doesn't match the emission site.
- Inputs like `/* #__PURE__ */`, `/*#__PURE__*/`, and `/* #__PURE__ -- @preserve */` round-trip verbatim instead of being normalized to `/* @__PURE__ */` ([rolldown#9408](rolldown/rolldown#9408)).
- Verified by `cargo test -p oxc_codegen` (105 tests; includes new regression for mixed-kind `attached_to` collision in `pure_comment`).

## Problem

Codegen previously dropped pure / no-side-effects comments and emitted a canonical literal at every site marked with the AST `pure` flag. This lost the original text:

- `/* #__PURE__ */ foo()` → `/* @__PURE__ */ foo()`: bundlers / tools that prefer the `#` form lost it.
- `/*#__PURE__*/ foo()` (no spaces) was expanded to `/* @__PURE__ */ foo()`.
- `/* #__PURE__ -- @preserve */ foo()` lost the trailing `-- @preserve` marker that some tools key on (rolldown#9408).

## Fix

A new `annotation_comments: FxHashMap<u32, Comment>` stashes `is_pure()` / `is_no_side_effects()` comments keyed by `attached_to`. `Codegen::print_annotation_comment(start, kind, newline_after)` emits the verbatim source when available, otherwise prints the canonical literal owned by `AnnotationKind::canonical(newline_after)`.

Why a separate map from the existing `comments` (legal / JSDoc / not-applied / normal): applied pure annotations must be emitted **inline at the `pure` site** with a trailing space (not as a leading comment with a newline), and only when `node.pure == true`. Sharing storage would either double-print or conflate two formatting modes inside `print_comments`. Comments where the parser couldn't apply the annotation (e.g. `/* #__PURE__ */ function foo() {}`; wrong target, classified as `CommentContent::PureNotApplied`) still flow through `comments` and the existing leading-comment behavior is preserved.

### Mixed-kind collision

When multiple annotation comments share an `attached_to` (e.g. `/* @__PURE__ */ /* @__NO_SIDE_EFFECTS__ */ foo()`), the map is last-write-wins. The `AnnotationKind` filter on lookup ensures a `@__NO_SIDE_EFFECTS__` slot can't be emitted in place of `@__PURE__` (or vice versa); the dropped comment falls back to the canonical literal. Per the [compiler-notations-spec](https://github.com/javascript-compiler-hints/compiler-notations-spec), the two annotations only apply to their corresponding node kinds (call/new vs function declaration), so mixing them in front of a `CallExpression` would be a real semantic regression.

## Before / after

```js
// input
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

```js
// before
/* @__PURE__ */ pureOperation();
/* @__PURE__ */ foo(bar);

// after
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

AI assisted.
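The lookup described under "Fix" and "Mixed-kind collision" can be pictured with a small standalone model. This is only a sketch, not the oxc implementation: `Printer`, the simplified `Comment` struct, the `stash` helper, and the exact whitespace appended after the comment are assumptions made for illustration, and `std::collections::HashMap` stands in for `FxHashMap`. Only the names `annotation_comments`, `AnnotationKind::canonical`, and `print_annotation_comment` come from the description above.

```rust
use std::collections::HashMap;

/// Which annotation an emission site expects.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AnnotationKind {
    Pure,
    NoSideEffects,
}

impl AnnotationKind {
    /// Canonical literal used when no matching source comment is available.
    /// The trailing whitespace here is an assumption of this sketch.
    fn canonical(self, newline_after: bool) -> &'static str {
        match (self, newline_after) {
            (AnnotationKind::Pure, false) => "/* @__PURE__ */ ",
            (AnnotationKind::Pure, true) => "/* @__PURE__ */\n",
            (AnnotationKind::NoSideEffects, false) => "/* @__NO_SIDE_EFFECTS__ */ ",
            (AnnotationKind::NoSideEffects, true) => "/* @__NO_SIDE_EFFECTS__ */\n",
        }
    }
}

/// Hypothetical, simplified comment record: its kind plus its byte span in the source.
#[derive(Clone, Copy)]
struct Comment {
    kind: AnnotationKind,
    start: usize,
    end: usize,
}

/// Hypothetical stand-in for the code generator.
struct Printer<'a> {
    source: Option<&'a str>,
    annotation_comments: HashMap<u32, Comment>,
    out: String,
}

impl<'a> Printer<'a> {
    fn new(source: Option<&'a str>) -> Self {
        Self { source, annotation_comments: HashMap::new(), out: String::new() }
    }

    /// Stash an annotation comment keyed by the node position it is attached to.
    /// A later comment with the same key overwrites the earlier one (last write wins).
    fn stash(&mut self, attached_to: u32, comment: Comment) {
        self.annotation_comments.insert(attached_to, comment);
    }

    /// Emit the verbatim comment if one of the right kind is stashed and the
    /// source text is available; otherwise emit the canonical literal.
    fn print_annotation_comment(&mut self, start: u32, kind: AnnotationKind, newline_after: bool) {
        let source = self.source;
        let verbatim = self
            .annotation_comments
            .get(&start)
            .filter(|comment| comment.kind == kind) // kind filter on lookup
            .and_then(|comment| source.and_then(|text| text.get(comment.start..comment.end)));
        match verbatim {
            Some(text) => {
                self.out.push_str(text);
                self.out.push(if newline_after { '\n' } else { ' ' });
            }
            None => self.out.push_str(kind.canonical(newline_after)),
        }
    }
}
```

In this model, stashing the span of `/* #__PURE__ -- @preserve */` and printing at the same key with `AnnotationKind::Pure` reproduces the original text, while a stashed `NoSideEffects` comment at a `Pure` site, or a printer constructed without source text, produces `/* @__PURE__ */`.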
438b9c7 to 9be0071
9be0071 to d61e1d7
### 🚀 Features

- e857b0c napi/minify: Expose legalComments option and result (#20370) (Boshen)
- 661132d parser: More friendly error messages for rest assignment target and rest binding element (#22719) (sapphi-red)
- ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option for nullable-union design:type (#22266) (Kyle Cannon)

### 🐛 Bug Fixes

- e1d064e transformer/class-properties: Reparent lifted private method helpers (#22716) (Cameron)
- 4ac0fca minifier: Preserve `0 && (module.exports = { ... })` cjs-module-lexer hint (#22729) (Dunqing)
- 40ff611 minifier: Mark peephole loop changed when dropping dead-after-throw statement (#22722) (Dunqing)
- 2f7b210 codegen: Emit pife-arrow/function leading comments inside the wrap (#22720) (Dunqing)
- e184f74 parser: Improve invalid `import` property access diagnostic (#22693) (camc314)
- 7baed9c transformer/private-method: Clear inherited strict flags (#22508) (camc314)
- a9ad27e parser: Keep annotation comments leading without preceding newline (#22711) (Dunqing)
- 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after peephole inlining (#22595) (Dunqing)
- 07afbb6 minifier: Drop empty-body IIFE wrapper when called with arguments (#22589) (Dunqing)
- fa7c463 semantic: Correct TS enum member symbol spans (#22689) (camc314)
- 26b9396 semantic: Resolve parameter decorators outside parameter scope (#22623) (camc314)
- b284045 parser: Switch to module goal eagerly on `export` (#22684) (Boshen)
- dfa931d semantic: Propagate unresolved auto-increment enum value instead of defaulting to 0 (#22646) (Dunqing)
- 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T> in decorator metadata (#22265) (Kyle Cannon)
- e421ef0 transformer/legacy-decorator: Return runtime binding for design:type (#22640) (Dunqing)
- d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects comments (#22525) (Dunqing)
- 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547) (Dunqing)
- 917da24 parser: Apply PURE comment through member-access chains (#22566) (Dunqing)
- a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings (#22551) (Dunqing)

### ⚡ Performance

- 2f623b0 semantic: Skip unresolved checks for re-exports (#22660) (camc314)
- 0d9553d semantic: Early-exit `check_object_expression` for objects with <2 properties (#22668) (Dunqing)
- d721ad9 semantic: Use direct grandparent lookup for TS type parameters (#22658) (camc314)
- 0aff288 semantic: Reorder numeric literal strict mode checks (#22657) (camc314)
- 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314)
- e32acd8 semantic: Reorder identifier ambient binding check (#22653) (camc314)
- 09fe178 semantic: Reorder ident reference strict mode check (#22652) (camc314)
- 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663) (camc314)
- 82f9662 parser: Check identifier kind before context flag (#22662) (camc314)
- d7cd951 parser: Fast path identifier parsing and inline operator helpers (#22650) (Boshen)
- 7b84314 semantic: Use direct byte access for numeric leading-zero check (#22642) (camc314)
- 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314)
- 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596) (Dunqing)
- 4f289f1 semantic: Resolve_references_for_current_scope without a temp Vec (#22599) (Dunqing)
- e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603) (Dunqing)
- 8ff8674 semantic: Early return if `excess` is `0` in `Stats::increase_by` (#22616) (camc314)
- 7a4120e semantic: Pre-reserve unresolved_references using Stats::references (#22580) (Dunqing)

Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>
## Summary

- Bump oxc Rust crates and `@oxc-project/*` / `oxc-*` npm packages from `0.132.0` to `0.133.0`. ~~Bump `oxc_resolver` / `oxc_resolver_napi` from `11.19.1` to `11.19.2`.~~
- Regenerate `embedded_helpers.rs` to point at `@oxc-project+runtime@0.133.0` (version string only, no helper-content drift).
- Handle two new oxc fields surfaced by `cargo check`:
  - `oxc::transformer::DecoratorOptions.strict_null_checks`: hardcoded to `true` in the `From` impl (matches oxc's default). Exposing it through rolldown's wrapper would be a separate API change.
  - `oxc_minify_napi::CodegenOptions.legal_comments`: set to `None`.
- Extend `DecoratorOptionSchema` and `CodegenOptionsSchema` in `validator.ts` to mirror the new optional fields.
- Refresh 32 snapshots driven by upstream oxc PR [#22547](oxc-project/oxc#22547) ("fix(minifier): preserve IIFE structure in DCE-only mode"). In DCE-only mode (rolldown's per-module preprocess via `Compressor::dead_code_elimination_with_scoping`), oxc no longer inlines `(() => …)()` IIFEs that carry (or could carry) `/*#__PURE__*/` annotations, so downstream tree-shaking can see them. Empty-IIFE elision still fires. Side effects: `property_read_side_effects` now correctly tree-shakes `/*#__PURE__*/ test().a.b.c;` lines and drops the previously spurious `INVALID_ANNOTATION` warnings; TS class `(() => new Foo())()` patterns and `Symbol((() => Math.random() < 0.5)() ? …)` calls are preserved as the sources actually wrote them; `dce_of_iife/diff.md` divergence vs esbuild shrunk.
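The "hardcoded to `true` in the `From` impl" item above follows a common pattern for absorbing a new upstream field without widening a wrapper's public API. The sketch below illustrates that pattern only; both structs are hypothetical stand-ins (neither is the real rolldown or oxc type), and every field other than `strict_null_checks` is an assumption.

```rust
/// Hypothetical stand-in for the upstream options struct that gained a new field.
#[derive(Debug, Default, PartialEq)]
struct UpstreamDecoratorOptions {
    legacy: bool,
    emit_decorator_metadata: bool,
    strict_null_checks: bool, // the newly added upstream field
}

/// Hypothetical wrapper options that do not (yet) expose the new field.
#[derive(Debug, Default)]
struct WrapperDecoratorOptions {
    legacy: Option<bool>,
    emit_decorator_metadata: Option<bool>,
}

impl From<WrapperDecoratorOptions> for UpstreamDecoratorOptions {
    fn from(options: WrapperDecoratorOptions) -> Self {
        Self {
            legacy: options.legacy.unwrap_or_default(),
            emit_decorator_metadata: options.emit_decorator_metadata.unwrap_or_default(),
            // Pinned to a fixed value; exposing it to wrapper users would be
            // a separate API change.
            strict_null_checks: true,
        }
    }
}
```

Pinning the value inside the conversion keeps the wrapper's schema unchanged, so callers see no new option until one is deliberately added.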
## Issues fixed

Pulled in from upstream oxc fixes between 0.132.0 and 0.133.0:

- Fixes #9437: Pure-annotated IIFE returning an array is inlined (via oxc [#22547](oxc-project/oxc#22547))
- Fixes #9494: Cross-module enum inlining mis-folds auto-increment member to 0 (via oxc [#22646](oxc-project/oxc#22646))
- Fixes #9408: Rolldown drops `@preserve` from `/* #__PURE__ -- @preserve */` comments (via oxc [#22525](oxc-project/oxc#22525) "preserve verbatim text of pure/no-side-effects comments")
- Partially addresses #8688: `dce/dce_of_iife` esbuild divergence shrunk (via oxc [#22589](oxc-project/oxc#22589) "drop empty-body IIFE wrapper when called with arguments")

Also relevant: oxc [#22566](oxc-project/oxc#22566) reduces false-positive `PureNotApplied` warnings (the warnings system landed in rolldown #9381) by applying PURE through member-access chains.

## Test plan

- [x] `cargo check --workspace --all-targets`
- [x] `just update-generated-code`
- [x] `just test-update`
- [x] `just ued`
- [x] `just roll`: 32 Rust suites pass (0 failed), 1212 node tests pass (0 failed), TS lint+types clean on 804 files

---------

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>
