Skip to content

refactor(semantic): identify illegal function declarations without using AST nodes#22972

Merged
graphite-app[bot] merged 1 commit into
mainfrom
om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes
Jun 5, 2026
Merged

refactor(semantic): identify illegal function declarations without using AST nodes#22972
graphite-app[bot] merged 1 commit into
mainfrom
om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes

Conversation

@overlookmotel

@overlookmotel overlookmotel commented Jun 5, 2026

Copy link
Copy Markdown
Member

Closes: #22934

Alternative to #22934.

The problem

check_function_redeclaration (the Annex B.3.3 legacy duplicate-function rule) distinguished a plain FunctionDeclaration from an async/generator one by reading the previous declaration's AST node through its stored NodeId:

let prev_function = ctx.nodes.kind(prev.declaration).as_function();
if prev_function.is_some_and(|f| !(f.r#async || f.generator)) { return; }

That node is a popped sibling (not an ancestor of the current node), so it's the last build-time read that would prevent making AST node storage optional.

Starting point

In real-world code, no-one uses function redeclarations. They are completely pointless. We only need to handle this case to be conformant with the spec, just to be able to say we're conformant with the spec.

Further, the specific circumstances in which this Annex B rule applies are even more rare. All the following must be satisfied:

  • Sloppy-mode code.
  • Function redeclarations in a nested block (not top-level or nested in another function).
  • JS not TS.

Therefore, we can assume the code which handles this case is unreachable in practice. The priority is to make recording the data required for the check as cheap as possible - in terms of perf, and memory usage.

How

Record the one bit of information we need at bind time, while the node is still in hand. When binding an async/generator function declaration in a sloppy-mode block, push its NodeId into a Vec on
SemanticBuilder. The check then answers "was the previous declaration async/generator?" by searching this Vec.

The check only applies in the specific circumstances listed above. So don't push values to the Vec in modules, strict mode, top-level code, or TypeScript. In practice, the Vec will always remain empty.

The checking code in check_redeclared_function does a linear scan of the Vec, which is expensive, but as function redeclarations in nested blocks in sloppy mode JS are vanishingly rare in real-world code, this doesn't matter.

Additionally, improve the errors when a violation is found - the error points to the offending async/generator function, rather than stopping at the immediately preceding declaration. That only adds diagnostics on code that was already invalid - conformance is unchanged.

@github-actions github-actions Bot added the A-semantic Area - Semantic label Jun 5, 2026

overlookmotel commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent changes, fast-track this PR to the front of the merge queue

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@codspeed-hq

codspeed-hq Bot commented Jun 5, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 57 untouched benchmarks
⏩ 9 skipped benchmarks1


Comparing om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes (b3a87da) with main (742fd0b)

Open in CodSpeed

Footnotes

  1. 9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@overlookmotel overlookmotel force-pushed the om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes branch from 1fabb2f to b3a87da Compare June 5, 2026 03:07
@overlookmotel overlookmotel marked this pull request as ready for review June 5, 2026 03:07
@overlookmotel overlookmotel requested a review from Dunqing as a code owner June 5, 2026 03:07
Copilot AI review requested due to automatic review settings June 5, 2026 03:07
@overlookmotel overlookmotel requested a review from Boshen June 5, 2026 03:11

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the JavaScript Annex B.3.3 function-redeclaration check to avoid reading a previous declaration’s AST node via NodeId, enabling future work where AST node storage can be optional. It does this by recording, at bind time, which function declarations were async/generator in sloppy-mode block scopes (JS-only), and consulting that build-only scratch data during the redeclaration check; it also improves diagnostics by pointing at the earlier async/generator declaration when that’s what makes the redeclaration illegal.

Changes:

  • Replace previous-declaration AST-node inspection in check_function_redeclaration with a scan over build-time recorded async/generator declaration NodeIds.
  • Add a build-only Vec<NodeId> scratch field on SemanticBuilder to track async/generator function declarations in sloppy-mode JS block scopes.
  • Populate that scratch field during function binding under the narrow conditions where Annex B.3.3 applies.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
crates/oxc_semantic/src/checker/javascript.rs Updates Annex B.3.3 redeclaration logic to avoid AST node reads and to report the earlier offending async/generator declaration span.
crates/oxc_semantic/src/builder.rs Adds build-only scratch storage (async_or_generator_function_node_ids) on SemanticBuilder.
crates/oxc_semantic/src/binder.rs Records qualifying async/generator function declaration node IDs during binding for later use by the checker.

@graphite-app

graphite-app Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Merge activity

…ing AST nodes (#22972)

Closes: #22934

Alternative to #22934.

### The problem

`check_function_redeclaration` (the Annex B.3.3 legacy duplicate-function rule) distinguished a plain `FunctionDeclaration` from an `async`/generator one by reading the _previous_ declaration's AST node through its stored `NodeId`:

```rust
let prev_function = ctx.nodes.kind(prev.declaration).as_function();
if prev_function.is_some_and(|f| !(f.r#async || f.generator)) { return; }
```

That node is a popped sibling (not an ancestor of the current node), so it's the last build-time read that would prevent making AST node storage optional.

### Starting point

In real-world code, no-one uses function redeclarations. They are completely pointless. We only need to handle this case to be conformant with the spec, just to be able to say we're conformant with the spec.

Further, the specific circumstances in which this Annex B rule applies are even more rare. All the following must be satisfied:

- Sloppy-mode code.
- Function redeclarations in a nested block (not top-level or nested in another function).
- JS not TS.

Therefore, we can assume the code which handles this case is unreachable in practice. The priority is to make recording the data required for the check as cheap as possible - in terms of perf, and memory usage.

### How

Record the one bit of information we need at bind time, while the node is still in hand. When binding an `async`/generator function declaration in a sloppy-mode block, push its `NodeId` into a `Vec` on
`SemanticBuilder`. The check then answers "was the previous declaration async/generator?" by searching this `Vec`.

The check only applies in the specific circumstances listed above. So don't push values to the `Vec` in modules, strict mode, top-level code, or TypeScript. In practice, the `Vec` will always remain empty.

The checking code in `check_redeclared_function` does a linear scan of the `Vec`, which is expensive, but as function redeclarations in nested blocks in sloppy mode JS are vanishingly rare in real-world code, this doesn't matter.

Additionally, improve the errors when a violation _is_ found - the error points to the offending `async`/generator function, rather than stopping at the immediately preceding declaration. That only adds diagnostics on code that was already invalid - conformance is unchanged.
@graphite-app graphite-app Bot force-pushed the om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes branch from b3a87da to 594ce58 Compare June 5, 2026 04:07
graphite-app Bot pushed a commit that referenced this pull request Jun 5, 2026
…d]` function (#22973)

#22972 made the code which handles function redeclarations heavier. The assumption of that PR is that function redeclarations are extremely rare. So move all the expensive code into a separate `#[cold]` function, so that the fast checks in `check_function_redeclaration` (the hot path) can be inlined into the caller.
@graphite-app graphite-app Bot merged commit 594ce58 into main Jun 5, 2026
29 checks passed
@graphite-app graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label Jun 5, 2026
@graphite-app graphite-app Bot deleted the om/06-05-refactor_semantic_identify_illegal_function_declarations_without_using_ast_nodes branch June 5, 2026 04:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-semantic Area - Semantic

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants