fix(parser): attach legal comments to following token#21670
fix(parser): attach legal comments to following token#21670graphite-app[bot] merged 1 commit intomainfrom
Conversation
Merging this PR will improve performance by 3.23%
Performance Changes
Comparing Footnotes
|
0e6b8ed to
5bc3fc0
Compare
|
@codex review |
|
Codex Review: Didn't find any major issues. Already looking forward to the next diff. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Always treat legal comments as leading comments of the following token, which matches esbuild's behavior. |
Merge activity
|
## Summary
- treat `Legal` and `JsdocLegal` comments as leading comments for the following token instead of allowing them to become trailing comments on the previous token
- apply that rule to both legal block comments and legal line comments after code
- add parser regressions for both cases and a codegen idempotency regression for comment-preserving minify mode
## Why
`monitor-oxc` whitespace found a repeated print/parse instability in comment-preserving whitespace removal. Some bundled files contain legal comments such as `@license`/`@preserve` or `/*! ... */` between two statements. After one print, Oxc could produce a shape where a legal comment appeared after code and before the next token. On the second parse, the trivia builder sometimes classified that legal comment as trailing trivia for the previous token.
That classification is wrong for legal-comment preservation. Oxc only treats legal comments as preservable legal comments when they are leading comments. So the parser should keep legal comments attachable to the following token instead of finalizing them as trailing comments.
This makes repeated parse/print cycles stable in modes that remove whitespace but keep comments, without changing executable JavaScript semantics. Ordinary non-legal comments still use the existing leading/trailing classification.
## esbuild comparison
I checked esbuild's implementation because Oxc's legal-comment behavior is intended to follow the same model. esbuild's lexer records legal comments in `LegalCommentsBeforeToken`, and the parser consumes that list before parsing the next statement by inserting `SComment { IsLegalComment: true }` statements. In other words, esbuild models legal comments as comments before the next token/statement, not as trailing trivia attached to the previous token.
For the concrete repro, esbuild with `minify: true` and `legalComments: "inline"` keeps the legal comments inline and is idempotent across repeated transforms:
```js
foo();/**
* @license MIT
**//*! #__NO_SIDE_EFFECTS__ */function bar(){}
```
The key invariant from esbuild is not the exact newline placement. It is that legal comments remain associated with the following statement/token for preservation. This change moves Oxc's parser classification toward that invariant while leaving ordinary non-legal trailing comments alone.
## Testing
- `cargo test -p oxc_parser trivia_builder -- --nocapture`
- `cargo test -p oxc_codegen comments -- --nocapture`
- `cargo fmt --all --check`
- `git diff --check`
- `monitor-oxc whitespace` filtered to the two previously failing files (`@vue-macros/devtools` and `@asamuzakjp/css-color`): both files were collected and processed with no `RemoveWhitespace` diagnostics. The local runtime phase then failed separately under Node.js v25.8.1 on old package imports.
74be544 to
aaabde4
Compare
### 💥 BREAKING CHANGES - 0ffbe0d allocator: [**BREAKING**] Remove `Allocator::end_ptr` method (#21871) (overlookmotel) ### 🚀 Features - 9593ec8 transformer/jsx: Add jsxDEV source metadata for fragments (#21932) (Ido Rosenthal) ### 🐛 Bug Fixes - 429deac napi/parser: Export `visitorKeys` from `wasm` entrypoint (#21996) (NullVoxPopuli) - e852911 codegen: Preserve legal comments orphaned by upstream passes (#21575) (Dunqing) - e3399ec transformer/class-properties: Preserve RHS in logical-assignment to static private field (#21950) (Dunqing) - c59c199 transformer/typescript: Emit class fields for parameter properties (#21831) (Dunqing) - aaabde4 parser: Attach legal comments to following token (#21670) (Dunqing) ### ⚡ Performance - 0bf0cb9 allocator: Per-platform `Arena::new_fixed_size` implementations (#22088) (overlookmotel) ### 📚 Documentation - 62ec410 allocator: Correct doc comment for `Allocator::from_raw_parts` (#22093) (overlookmotel) - 3e152c6 allocator: Correct typos in comments (#22092) (overlookmotel) - e220855 allocator: Correct doc comment for `Allocator::set_cursor_ptr` (#21866) (overlookmotel) --------- Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com> Co-authored-by: Cameron Clark <cameron.clark@hey.com>
Summary
LegalandJsdocLegalcomments as leading comments for the following token instead of allowing them to become trailing comments on the previous tokenWhy
monitor-oxcwhitespace found a repeated print/parse instability in comment-preserving whitespace removal. Some bundled files contain legal comments such as@license/@preserveor/*! ... */between two statements. After one print, Oxc could produce a shape where a legal comment appeared after code and before the next token. On the second parse, the trivia builder sometimes classified that legal comment as trailing trivia for the previous token.That classification is wrong for legal-comment preservation. Oxc only treats legal comments as preservable legal comments when they are leading comments. So the parser should keep legal comments attachable to the following token instead of finalizing them as trailing comments.
This makes repeated parse/print cycles stable in modes that remove whitespace but keep comments, without changing executable JavaScript semantics. Ordinary non-legal comments still use the existing leading/trailing classification.
esbuild comparison
I checked esbuild's implementation because Oxc's legal-comment behavior is intended to follow the same model. esbuild's lexer records legal comments in
LegalCommentsBeforeToken, and the parser consumes that list before parsing the next statement by insertingSComment { IsLegalComment: true }statements. In other words, esbuild models legal comments as comments before the next token/statement, not as trailing trivia attached to the previous token.For the concrete repro, esbuild with
minify: trueandlegalComments: "inline"keeps the legal comments inline and is idempotent across repeated transforms:The key invariant from esbuild is not the exact newline placement. It is that legal comments remain associated with the following statement/token for preservation. This change moves Oxc's parser classification toward that invariant while leaving ordinary non-legal trailing comments alone.
Testing
cargo test -p oxc_parser trivia_builder -- --nocapturecargo test -p oxc_codegen comments -- --nocapturecargo fmt --all --checkgit diff --checkmonitor-oxc whitespacefiltered to the two previously failing files (@vue-macros/devtoolsand@asamuzakjp/css-color): both files were collected and processed with noRemoveWhitespacediagnostics. The local runtime phase then failed separately under Node.js v25.8.1 on old package imports.