Skip to content

fix(codegen): preserve verbatim text of pure/no-side-effects comments#22525

Merged
graphite-app[bot] merged 1 commit into
mainfrom
fix/codegen-preserve-verbatim-pure-comment
May 19, 2026
Merged

fix(codegen): preserve verbatim text of pure/no-side-effects comments#22525
graphite-app[bot] merged 1 commit into
mainfrom
fix/codegen-preserve-verbatim-pure-comment

Conversation

@Dunqing

@Dunqing Dunqing commented May 18, 2026

Copy link
Copy Markdown
Member

Summary

  • Stash @__PURE__ / @__NO_SIDE_EFFECTS__ (and #-prefixed) comments by attached_to during comment build, then emit the original source text at call / new / function-decl sites. Falls back to the canonical literal (/* @__PURE__ */, /* @__NO_SIDE_EFFECTS__ */) only when source text is unavailable or the stashed comment's kind doesn't match the emission site.
  • Inputs like /* #__PURE__ */, /*#__PURE__*/, and /* #__PURE__ -- @preserve */ round-trip verbatim instead of being normalized to /* @__PURE__ */ (rolldown#9408).
  • Verified by cargo test -p oxc_codegen (105 tests; includes new regression for mixed-kind attached_to collision in pure_comment).

Problem

Codegen previously dropped pure / no-side-effects comments and emitted a canonical literal at every site marked with the AST pure flag. This lost the original text:

  • /* #__PURE__ */ foo()/* @__PURE__ */ foo() — bundlers / tools that prefer the # form lost it.
  • /*#__PURE__*/ foo() (no spaces) was expanded to /* @__PURE__ */ foo().
  • /* #__PURE__ -- @preserve */ foo() lost the trailing -- @preserve marker that some tools key on (rolldown#9408).

Fix

A new annotation_comments: FxHashMap<u32, Comment> stashes is_pure() / is_no_side_effects() comments keyed by attached_to. Codegen::print_annotation_comment(start, kind, newline_after) emits the verbatim source when available, otherwise prints the canonical literal owned by AnnotationKind::canonical(newline_after).

Why a separate map from the existing comments (legal / JSDoc / not-applied / normal): applied pure annotations must be emitted inline at the pure site with a trailing space — not as a leading comment with a newline — and only when node.pure == true. Sharing storage would either double-print or conflate two formatting modes inside print_comments. Comments where the parser couldn't apply the annotation (e.g. /* #__PURE__ */ function foo() {} — wrong target; classified as CommentContent::PureNotApplied) still flow through comments and the existing leading-comment behavior is preserved.

Mixed-kind collision

When multiple annotation comments share an attached_to (e.g. /* @__PURE__ */ /* @__NO_SIDE_EFFECTS__ */ foo()), the map is last-write-wins. The AnnotationKind filter on lookup ensures a @__NO_SIDE_EFFECTS__ slot can't be emitted in place of @__PURE__ (or vice versa); the dropped comment falls back to the canonical literal. Per the compiler-notations-spec, the two annotations only apply to their corresponding node kinds (call/new vs function declaration), so mixing them in front of a CallExpression would be a real semantic regression.

Before / after

// input
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
// before
/* @__PURE__ */ pureOperation();
/* @__PURE__ */ foo(bar);

// after
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);

AI assisted.

@github-actions github-actions Bot added the A-codegen Area - Code Generation label May 18, 2026

Dunqing commented May 18, 2026

Copy link
Copy Markdown
Member Author

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent changes, fast-track this PR to the front of the merge queue

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@Dunqing Dunqing marked this pull request as ready for review May 18, 2026 09:45
@codspeed-hq

codspeed-hq Bot commented May 18, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 48 untouched benchmarks
⏩ 3 skipped benchmarks1


Comparing fix/codegen-preserve-verbatim-pure-comment (438b9c7) with main (6dba588)

Open in CodSpeed

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@Dunqing Dunqing force-pushed the fix/codegen-preserve-verbatim-pure-comment branch from 9b1859f to 82bb185 Compare May 18, 2026 10:20
@Dunqing Dunqing requested a review from Boshen May 18, 2026 13:43
@Dunqing

Dunqing commented May 19, 2026

Copy link
Copy Markdown
Member Author

@copilot resolve the merge conflicts in this pull request

Copilot AI commented May 19, 2026

Copy link
Copy Markdown
Contributor

@copilot resolve the merge conflicts in this pull request

Resolved. The only conflict was in crates/oxc_codegen/src/gen.rs — the use crate::{...} block needed both comment::AnnotationKind (from this PR) and cjs_module_lexer (from main). Combined both imports and merged in 438b9c7.

@Dunqing Dunqing added the 0-merge Merge with Graphite Merge Queue label May 19, 2026

Dunqing commented May 19, 2026

Copy link
Copy Markdown
Member Author

Merge activity

graphite-app Bot pushed a commit that referenced this pull request May 19, 2026
…#22525)

## Summary

- Stash `@__PURE__` / `@__NO_SIDE_EFFECTS__` (and `#`-prefixed) comments by `attached_to` during comment build, then emit the original source text at call / new / function-decl sites. Falls back to the canonical literal (`/* @__PURE__ */`, `/* @__NO_SIDE_EFFECTS__ */`) only when source text is unavailable or the stashed comment's kind doesn't match the emission site.
- Inputs like `/* #__PURE__ */`, `/*#__PURE__*/`, and `/* #__PURE__ -- @preserve */` round-trip verbatim instead of being normalized to `/* @__PURE__ */` ([rolldown#9408](rolldown/rolldown#9408)).
- Verified by `cargo test -p oxc_codegen` (105 tests; includes new regression for mixed-kind `attached_to` collision in `pure_comment`).

## Problem

Codegen previously dropped pure / no-side-effects comments and emitted a canonical literal at every site marked with the AST `pure` flag. This lost the original text:

- `/* #__PURE__ */ foo()` → `/* @__PURE__ */ foo()` — bundlers / tools that prefer the `#` form lost it.
- `/*#__PURE__*/ foo()` (no spaces) was expanded to `/* @__PURE__ */ foo()`.
- `/* #__PURE__ -- @preserve */ foo()` lost the trailing `-- @preserve` marker that some tools key on (rolldown#9408).

## Fix

A new `annotation_comments: FxHashMap<u32, Comment>` stashes `is_pure()` / `is_no_side_effects()` comments keyed by `attached_to`. `Codegen::print_annotation_comment(start, kind, newline_after)` emits the verbatim source when available, otherwise prints the canonical literal owned by `AnnotationKind::canonical(newline_after)`.

Why a separate map from the existing `comments` (legal / JSDoc / not-applied / normal): applied pure annotations must be emitted **inline at the `pure` site** with a trailing space — not as a leading comment with a newline — and only when `node.pure == true`. Sharing storage would either double-print or conflate two formatting modes inside `print_comments`. Comments where the parser couldn't apply the annotation (e.g. `/* #__PURE__ */ function foo() {}` — wrong target; classified as `CommentContent::PureNotApplied`) still flow through `comments` and the existing leading-comment behavior is preserved.

### Mixed-kind collision

When multiple annotation comments share an `attached_to` (e.g. `/* @__PURE__ */ /* @__NO_SIDE_EFFECTS__ */ foo()`), the map is last-write-wins. The `AnnotationKind` filter on lookup ensures a `@__NO_SIDE_EFFECTS__` slot can't be emitted in place of `@__PURE__` (or vice versa); the dropped comment falls back to the canonical literal. Per the [compiler-notations-spec](https://github.com/javascript-compiler-hints/compiler-notations-spec), the two annotations only apply to their corresponding node kinds (call/new vs function declaration), so mixing them in front of a `CallExpression` would be a real semantic regression.

## Before / after

```js
// input
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

```js
// before
/* @__PURE__ */ pureOperation();
/* @__PURE__ */ foo(bar);

// after
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

AI assisted.
@graphite-app graphite-app Bot force-pushed the fix/codegen-preserve-verbatim-pure-comment branch from 438b9c7 to 9be0071 Compare May 19, 2026 10:10
…#22525)

## Summary

- Stash `@__PURE__` / `@__NO_SIDE_EFFECTS__` (and `#`-prefixed) comments by `attached_to` during comment build, then emit the original source text at call / new / function-decl sites. Falls back to the canonical literal (`/* @__PURE__ */`, `/* @__NO_SIDE_EFFECTS__ */`) only when source text is unavailable or the stashed comment's kind doesn't match the emission site.
- Inputs like `/* #__PURE__ */`, `/*#__PURE__*/`, and `/* #__PURE__ -- @preserve */` round-trip verbatim instead of being normalized to `/* @__PURE__ */` ([rolldown#9408](rolldown/rolldown#9408)).
- Verified by `cargo test -p oxc_codegen` (105 tests; includes new regression for mixed-kind `attached_to` collision in `pure_comment`).

## Problem

Codegen previously dropped pure / no-side-effects comments and emitted a canonical literal at every site marked with the AST `pure` flag. This lost the original text:

- `/* #__PURE__ */ foo()` → `/* @__PURE__ */ foo()` — bundlers / tools that prefer the `#` form lost it.
- `/*#__PURE__*/ foo()` (no spaces) was expanded to `/* @__PURE__ */ foo()`.
- `/* #__PURE__ -- @preserve */ foo()` lost the trailing `-- @preserve` marker that some tools key on (rolldown#9408).

## Fix

A new `annotation_comments: FxHashMap<u32, Comment>` stashes `is_pure()` / `is_no_side_effects()` comments keyed by `attached_to`. `Codegen::print_annotation_comment(start, kind, newline_after)` emits the verbatim source when available, otherwise prints the canonical literal owned by `AnnotationKind::canonical(newline_after)`.

Why a separate map from the existing `comments` (legal / JSDoc / not-applied / normal): applied pure annotations must be emitted **inline at the `pure` site** with a trailing space — not as a leading comment with a newline — and only when `node.pure == true`. Sharing storage would either double-print or conflate two formatting modes inside `print_comments`. Comments where the parser couldn't apply the annotation (e.g. `/* #__PURE__ */ function foo() {}` — wrong target; classified as `CommentContent::PureNotApplied`) still flow through `comments` and the existing leading-comment behavior is preserved.

### Mixed-kind collision

When multiple annotation comments share an `attached_to` (e.g. `/* @__PURE__ */ /* @__NO_SIDE_EFFECTS__ */ foo()`), the map is last-write-wins. The `AnnotationKind` filter on lookup ensures a `@__NO_SIDE_EFFECTS__` slot can't be emitted in place of `@__PURE__` (or vice versa); the dropped comment falls back to the canonical literal. Per the [compiler-notations-spec](https://github.com/javascript-compiler-hints/compiler-notations-spec), the two annotations only apply to their corresponding node kinds (call/new vs function declaration), so mixing them in front of a `CallExpression` would be a real semantic regression.

## Before / after

```js
// input
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

```js
// before
/* @__PURE__ */ pureOperation();
/* @__PURE__ */ foo(bar);

// after
/* #__PURE__ -- @preserve */ pureOperation();
/*#__PURE__*/ foo(bar);
```

AI assisted.
@graphite-app graphite-app Bot force-pushed the fix/codegen-preserve-verbatim-pure-comment branch from 9be0071 to d61e1d7 Compare May 19, 2026 10:13
@graphite-app graphite-app Bot merged commit d61e1d7 into main May 19, 2026
29 checks passed
@graphite-app graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label May 19, 2026
@graphite-app graphite-app Bot deleted the fix/codegen-preserve-verbatim-pure-comment branch May 19, 2026 10:17
Dunqing added a commit that referenced this pull request May 26, 2026
### 🚀 Features

- e857b0c napi/minify: Expose legalComments option and result (#20370)
(Boshen)
- 661132d parser: More friendly error messages for rest assignment
target and rest binding element (#22719) (sapphi-red)
- ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option
for nullable-union design:type (#22266) (Kyle Cannon)

### 🐛 Bug Fixes

- e1d064e transformer/class-properties: Reparent lifted private method
helpers (#22716) (Cameron)
- 4ac0fca minifier: Preserve `0 && (module.exports = { ... })`
cjs-module-lexer hint (#22729) (Dunqing)
- 40ff611 minifier: Mark peephole loop changed when dropping
dead-after-throw statement (#22722) (Dunqing)
- 2f7b210 codegen: Emit pife-arrow/function leading comments inside the
wrap (#22720) (Dunqing)
- e184f74 parser: Improve invalid `import` property access diagnostic
(#22693) (camc314)
- 7baed9c transformer/private-method: Clear inherited strict flags
(#22508) (camc314)
- a9ad27e parser: Keep annotation comments leading without preceding
newline (#22711) (Dunqing)
- 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after
peephole inlining (#22595) (Dunqing)
- 07afbb6 minifier: Drop empty-body IIFE wrapper when called with
arguments (#22589) (Dunqing)
- fa7c463 semantic: Correct TS enum member symbol spans (#22689)
(camc314)
- 26b9396 semantic: Resolve parameter decorators outside parameter scope
(#22623) (camc314)
- b284045 parser: Switch to module goal eagerly on `export` (#22684)
(Boshen)
- dfa931d semantic: Propagate unresolved auto-increment enum value
instead of defaulting to 0 (#22646) (Dunqing)
- 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T>
in decorator metadata (#22265) (Kyle Cannon)
- e421ef0 transformer/legacy-decorator: Return runtime binding for
design:type (#22640) (Dunqing)
- d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects
comments (#22525) (Dunqing)
- 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547)
(Dunqing)
- 917da24 parser: Apply PURE comment through member-access chains
(#22566) (Dunqing)
- a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings
(#22551) (Dunqing)

### ⚡ Performance

- 2f623b0 semantic: Skip unresolved checks for re-exports (#22660)
(camc314)
- 0d9553d semantic: Early-exit `check_object_expression` for objects
with <2 properties (#22668) (Dunqing)
- d721ad9 semantic: Use direct grandparent lookup for TS type parameters
(#22658) (camc314)
- 0aff288 semantic: Reorder numeric literal strict mode checks (#22657)
(camc314)
- 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314)
- e32acd8 semantic: Reorder identifier ambient binding check (#22653)
(camc314)
- 09fe178 semantic: Reorder ident reference strict mode check (#22652)
(camc314)
- 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663)
(camc314)
- 82f9662 parser: Check identifier kind before context flag (#22662)
(camc314)
- d7cd951 parser: Fast path identifier parsing and inline operator
helpers (#22650) (Boshen)
- 7b84314 semantic: Use direct byte access for numeric leading-zero
check (#22642) (camc314)
- 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314)
- 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596)
(Dunqing)
- 4f289f1 semantic: Resolve_references_for_current_scope without a temp
Vec (#22599) (Dunqing)
- e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603)
(Dunqing)
- 8ff8674 semantic: Early return if `excess` is `0` in
`Stats::increase_by` (#22616) (camc314)
- 7a4120e semantic: Pre-reserve unresolved_references using
Stats::references (#22580) (Dunqing)

Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>
shulaoda added a commit to rolldown/rolldown that referenced this pull request May 27, 2026
## Summary

- Bump oxc Rust crates and `@oxc-project/*` / `oxc-*` npm packages from `0.132.0` to `0.133.0`. ~~Bump `oxc_resolver` / `oxc_resolver_napi` from `11.19.1` to `11.19.2`.~~
- Regenerate `embedded_helpers.rs` to point at `@oxc-project+runtime@0.133.0` (version string only — no helper-content drift).
- Handle two new oxc fields surfaced by `cargo check`:
  - `oxc::transformer::DecoratorOptions.strict_null_checks` — hardcoded to `true` in the `From` impl (matches oxc's default). Exposing it through rolldown's wrapper would be a separate API change.
  - `oxc_minify_napi::CodegenOptions.legal_comments` — set to `None`.
- Extend `DecoratorOptionSchema` and `CodegenOptionsSchema` in `validator.ts` to mirror the new optional fields.
- Refresh 32 snapshots driven by upstream oxc PR [#22547](oxc-project/oxc#22547) ("fix(minifier): preserve IIFE structure in DCE-only mode"). In DCE-only mode (rolldown's per-module preprocess via `Compressor::dead_code_elimination_with_scoping`), oxc no longer inlines `(() => …)()` IIFEs that carry — or could carry — `/*#__PURE__*/` annotations, so downstream tree-shaking can see them. Empty-IIFE elision still fires. Side effects: `property_read_side_effects` now correctly tree-shakes `/*#__PURE__*/ test().a.b.c;` lines and drops the previously spurious `INVALID_ANNOTATION` warnings; TS class `(() => new Foo())()` patterns and `Symbol((() => Math.random() < 0.5)() ? …)` calls are preserved as the sources actually wrote them; `dce_of_iife/diff.md` divergence vs esbuild shrunk.

## Issues fixed

Pulled in from upstream oxc fixes between 0.132.0 and 0.133.0:

- Fixes #9437 — Pure-annotated IIFE returning an array is inlined (via oxc [#22547](oxc-project/oxc#22547))
- Fixes #9494 — Cross-module enum inlining mis-folds auto-increment member to 0 (via oxc [#22646](oxc-project/oxc#22646))
- Fixes #9408 — Rolldown drops `@preserve` from `/* #__PURE__ -- @preserve */` comments (via oxc [#22525](oxc-project/oxc#22525) "preserve verbatim text of pure/no-side-effects comments")
- Partially addresses #8688 — `dce/dce_of_iife` esbuild divergence shrunk (via oxc [#22589](oxc-project/oxc#22589) "drop empty-body IIFE wrapper when called with arguments")

Also relevant: oxc [#22566](oxc-project/oxc#22566) reduces false-positive `PureNotApplied` warnings (the warnings system landed in rolldown #9381) by applying PURE through member-access chains.

## Test plan

- [x] `cargo check --workspace --all-targets`
- [x] `just update-generated-code`
- [x] `just test-update`
- [x] `just ued`
- [x] `just roll` — 32 Rust suites pass (0 failed), 1212 node tests pass (0 failed), TS lint+types clean on 804 files

---------

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-codegen Area - Code Generation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants