Skip to content

fix(es/regexp): implement transform-named-capturing-groups-regex#11642

Merged
kdy1 merged 5 commits intomainfrom
claude/issue-11608-20260303-2335
Mar 12, 2026
Merged

fix(es/regexp): implement transform-named-capturing-groups-regex#11642
kdy1 merged 5 commits intomainfrom
claude/issue-11608-20260303-2335

Conversation

@kdy1
Copy link
Copy Markdown
Member

@kdy1 kdy1 commented Mar 11, 2026

Previously, when transform-named-capturing-groups-regex was enabled,
the RegexpPass detected named capturing groups and converted the regex
literal to a RegExp() constructor call, but didn't actually strip the
named groups from the pattern.

This commit properly implements the transform by:

  • Parsing the regex AST to find all named capturing groups
  • Stripping group names ((?<name>...) -> (...))
  • Converting named backreferences (\k<name> -> \N)
  • Building a name-to-index mapping object
  • Wrapping the result with _wrap_reg_exp(regex, mapping) helper

Closes #11608

Generated with Claude Code

Previously, when `transform-named-capturing-groups-regex` was enabled,
the RegexpPass detected named capturing groups and converted the regex
literal to a `RegExp()` constructor call, but didn't actually strip the
named groups from the pattern. This meant the resulting code still
contained `(?<name>...)` syntax, which older browsers don't support.

This commit properly implements the transform by:
- Parsing the regex AST to find all named capturing groups
- Stripping group names (`(?<name>...)` -> `(...)`)
- Converting named backreferences (`\k<name>` -> `\N`)
- Building a name-to-index mapping object
- Wrapping the result with `_wrap_reg_exp(regex, mapping)` helper

The `_wrap_reg_exp` helper creates a RegExp subclass that:
- Overrides `exec()` to populate `result.groups` from the mapping
- Overrides `Symbol.replace` to handle `$<name>` substitutions

Closes #11608

Co-authored-by: Donny/강동윤 <kdy1@users.noreply.github.com>
@kdy1 kdy1 requested review from a team as code owners March 11, 2026 05:07
Copilot AI review requested due to automatic review settings March 11, 2026 05:07
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Mar 11, 2026

🦋 Changeset detected

Latest commit: 45a5e99

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Mar 11, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ kdy1
❌ github-actions[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

@claude

This comment has been minimized.

@claude

This comment has been minimized.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82f52c8990

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +193 to +197
let props = mapping
.into_iter()
.map(|(name, idx)| {
PropOrSpread::Prop(Box::new(Prop::KeyValue(KeyValueProp {
key: PropName::Ident(IdentName {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Emit array mappings for duplicate named captures

Patterns that reuse the same group name in different alternatives (supported by this parser, e.g. (?<x>a)|(?<x>b)) lose correctness here because each (name, idx) is emitted as a separate object property, and duplicate keys keep only the last index. The new _wrap_reg_exp helper already has array-handling logic for duplicate names, but this mapping never provides arrays, so result.groups[name]/$<name> can resolve to undefined for branches that populate an earlier capture.

Useful? React with 👍 / 👎.

Comment on lines +381 to +383
let replacement = if let Term::NamedReference(nr) = &*term {
mapping.iter().find(|(n, _)| *n == nr.name).map(|(_, idx)| {
Term::IndexedReference(Box::new(IndexedReference {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve duplicate-name backreference semantics

Named backreferences are rewritten by taking the first matching name from mapping, which is incorrect when the same name appears multiple times across alternatives. For cases like (?:(?<x>a)|(?<x>b))\k<x>, this always rewrites \k<x> to the first numeric group, changing which inputs match whenever a later x capture is the one that participated.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the previously-missing transform-named-capturing-groups-regex behavior by rewriting named capture syntax in regex literals and wrapping the result with a runtime helper to preserve groups semantics.

Changes:

  • Add _wrap_reg_exp helper (both Rust-injected helper source and @swc/helpers ESM entry) and register it in the helpers registry.
  • Implement named capture group stripping + named backreference conversion in RegexpPass using swc_ecma_regexp AST parsing.
  • Add/adjust fixture coverage for preset-env and the reported issue (#11608).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/helpers/package.json Exposes the new helper entrypoint via package exports.
packages/helpers/esm/_wrap_reg_exp.js Adds the ESM helper implementation for wrapping regexes with group mappings.
crates/swc_ecma_transforms_base/src/helpers/mod.rs Registers wrap_reg_exp in the Rust helper system with dependencies.
crates/swc_ecma_transforms_base/src/helpers/_wrap_reg_exp.js Adds the injected helper source used by Rust transforms.
crates/swc_ecma_transformer/src/regexp.rs Implements named capture group parsing/rewriting and emits _wrap_reg_exp(...) calls.
crates/swc_ecma_preset_env/tests/fixtures/transform/named-capturing-groups-regex/input.mjs New preset-env input fixture for named groups/backrefs.
crates/swc_ecma_preset_env/tests/fixtures/transform/named-capturing-groups-regex/output.mjs Expected output fixture using _wrap_reg_exp with stripped patterns and mappings.
crates/swc_ecma_preset_env/tests/fixtures/transform/named-capturing-groups-regex/options.json Preset-env targets configuration for the new fixture.
crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs Updates expected output to use _wrap_reg_exp for named groups under usage mode.
crates/swc/tests/fixture/issues-11xxx/11608/input/input.js Regression fixture input reproducing #11608.
crates/swc/tests/fixture/issues-11xxx/11608/input/.swcrc Fixture config enabling external helpers + named groups transform.
crates/swc/tests/fixture/issues-11xxx/11608/output/input.js Expected output requiring and using the new helper.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread packages/helpers/esm/_wrap_reg_exp.js Outdated
this,
str,
substitution.replace(/\$<([^>]+)>/g, function(_, name) {
var group = groups[name];
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol.replace string-substitution branch returns "$" + group even when groups[name] is missing/undefined, producing "$undefined". This should handle unknown group names safely (Babel’s helper returns an empty string for missing mappings) to avoid incorrect replacement output or runtime errors.

Suggested change
var group = groups[name];
var group = groups && groups[name];
if (group === undefined) return "";

Copilot uses AI. Check for mistakes.
this,
str,
substitution.replace(/\$<([^>]+)>/g, function(_, name) {
var group = groups[name];
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the injected helper version of _wrap_reg_exp, the Symbol.replace string-substitution branch will produce "$undefined" if groups[name] is missing. It should treat unknown group names as an empty string (matching Babel’s behavior) and/or guard against missing mappings to prevent incorrect replacement output.

Suggested change
var group = groups[name];
var group = groups ? groups[name] : undefined;
if (group === undefined) return "";

Copilot uses AI. Check for mistakes.
Comment on lines +681 to +686
"./_/_wrap_reg_exp": {
"module-sync": "./esm/_wrap_reg_exp.js",
"webpack": "./esm/_wrap_reg_exp.js",
"import": "./esm/_wrap_reg_exp.js",
"default": "./cjs/_wrap_reg_exp.cjs"
},
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages/helpers/esm/index.js is auto-generated (per its header) and currently does not export _wrap_reg_exp, even though this PR adds the helper + an export-map entry. To keep generated artifacts consistent, regenerate helpers (e.g. npm run build in packages/helpers) and commit the updated generated files (notably esm/index.js).

Suggested change
"./_/_wrap_reg_exp": {
"module-sync": "./esm/_wrap_reg_exp.js",
"webpack": "./esm/_wrap_reg_exp.js",
"import": "./esm/_wrap_reg_exp.js",
"default": "./cjs/_wrap_reg_exp.cjs"
},

Copilot uses AI. Check for mistakes.
Comment on lines +382 to +387
mapping.iter().find(|(n, _)| *n == nr.name).map(|(_, idx)| {
Term::IndexedReference(Box::new(IndexedReference {
span: nr.span,
index: *idx,
}))
})
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In strip_named_groups_term, the name comparison uses *n == nr.name. swc_atoms::Atom is not Copy, so dereferencing &Atom like this will move the value and fail to compile. Compare by reference instead (e.g. n == &nr.name) or adjust the mapping lookup to avoid moving Atom out of the iterator item.

Suggested change
mapping.iter().find(|(n, _)| *n == nr.name).map(|(_, idx)| {
Term::IndexedReference(Box::new(IndexedReference {
span: nr.span,
index: *idx,
}))
})
mapping
.iter()
.find(|(n, _)| n == &nr.name)
.map(|(_, idx)| {
Term::IndexedReference(Box::new(IndexedReference {
span: nr.span,
index: *idx,
}))
})

Copilot uses AI. Check for mistakes.
kodiakhq[bot]
kodiakhq Bot previously approved these changes Mar 11, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 11, 2026

Binary Sizes

File Size
swc.linux-x64-gnu.node 28M (28754504 bytes)

Commit: 59c3c51

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Mar 11, 2026

Merging this PR will not alter performance

✅ 218 untouched benchmarks


Comparing claude/issue-11608-20260303-2335 (45a5e99) with main (0f24ee1)

Open in CodSpeed

@kdy1
Copy link
Copy Markdown
Member Author

kdy1 commented Mar 11, 2026

@claude fix ci

@claude

This comment has been minimized.

@claude

This comment has been minimized.

Update expected test outputs to reflect the named capturing groups
transform now being properly applied.

Co-authored-by: Donny/강동윤 <kdy1@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 11, 2026 23:58
@kdy1 kdy1 review requested due to automatic review settings March 11, 2026 23:58
kodiakhq[bot]
kodiakhq Bot previously approved these changes Mar 11, 2026
@socket-security
Copy link
Copy Markdown

socket-security Bot commented Mar 11, 2026

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn Critical
Critical CVE: npm cipher-base is missing type checks, leading to hash rewind and passing on crafted data

CVE: GHSA-cpq7-6gpm-g9rc cipher-base is missing type checks, leading to hash rewind and passing on crafted data (CRITICAL)

Affected versions: < 1.0.5

Patched version: 1.0.5

From: ?npm/cipher-base@1.0.4

ℹ Read more on: This package | This alert | What is a critical CVE?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Remove or replace dependencies that include known critical CVEs. Consumers can use dependency overrides or npm audit fix --force to remove vulnerable dependencies.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore npm/cipher-base@1.0.4. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

Warn High
Obfuscated code: npm buffer is 96.0% likely obfuscated

Confidence: 0.96

Location: Package overview

From: ?npm/buffer@4.9.2

ℹ Read more on: This package | This alert | What is obfuscated code?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore npm/buffer@4.9.2. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

@claude

This comment has been minimized.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c78a9d006

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +197 to +200
key: PropName::Ident(IdentName {
span: DUMMY_SP,
sym: name,
}),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Emit mapping keys safely for __proto__ group names

Building the name-to-index map with PropName::Ident makes a capture named __proto__ emit as { __proto__: 1 }, which is treated as a prototype setter instead of a normal own property. In that case _wrap_reg_exp cannot find the group key via Object.keys, so result.groups["__proto__"] and $<__proto__> semantics diverge from native named groups. Please emit non-special keys (for example computed/string keys) so __proto__ is preserved as data.

Useful? React with 👍 / 👎.

Comment on lines +29 to +30
var group = groups[name];
return "$" + (Array.isArray(group) ? group.join("$") : group);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve replace semantics for missing named placeholders

When the replacement string contains $<name> that is not present in the regex groups, native behavior with named captures is to substitute an empty string, but this callback currently rewrites it to "$undefined". That changes results for cases like /(?<x>a)/ with replacement "$<y>", which should produce "" but will now produce "$undefined" after transform.

Useful? React with 👍 / 👎.

@kdy1
Copy link
Copy Markdown
Member Author

kdy1 commented Mar 12, 2026

@claude Reflect #11642 (review)

@claude

This comment has been minimized.

…ansform

- Deduplicate needs_transform logic by reusing needs_regexp_constructor()
- Guard g.name = None with is_some() check for clarity
- Rename BabelRegExp to WrappedRegExp in helper functions
- Handle undefined group in Symbol.replace callback (return "" instead of "$undefined")
- Add edge case tests: mixed named/unnamed groups, nested groups, alternatives

Co-authored-by: Donny/강동윤 <kdy1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 12, 2026 00:52
@kdy1 kdy1 review requested due to automatic review settings March 12, 2026 00:52
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Mar 12, 2026

PR Review: fix(es/regexp): implement transform-named-capturing-groups-regex

Good work on implementing the core named-capturing-groups transform. The two-pass approach (collect then strip) is clean, and the test coverage for regex literals is solid. Here are some issues I found:


Bug: Missing _wrap_reg_exp import in corejs3/usage-regexp output

crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:4

The output uses _wrap_reg_exp(...) but it's never imported or defined. The other test outputs correctly show an import from @swc/helpers, but this fixture (which uses useBuiltIns: "usage" with corejs 3) doesn't get the helper import injected. This will produce a ReferenceError at runtime.

// Current output (broken):
var a = _wrap_reg_exp(RegExp("(\\d{4})-(\\d{2})-(\\d{2})", "u"), { ... });

// Expected: should have an import for the helper
import { _ as _wrap_reg_exp } from "@swc/helpers/_/_wrap_reg_exp";

Gap: Named groups in RegExp() constructor calls not transformed

The Expr::Call and Expr::New branches (lines 305-318) for RegExp(...) / new RegExp(...) only call transform_regexp_args which handles unicode property escapes. Named capturing groups in constructor-form regexes like new RegExp("(?<year>\\d{4})") are not transformed.

While this is an edge case (most people use regex literals), Babel does handle it. At minimum, consider adding a comment noting this as a known limitation, or track it as a follow-up.


Minor: (?< false positive in the initial quick-check

crates/swc_ecma_transformer/src/regexp.rs:222

if self.options.named_capturing_groups_regex && regex.exp.contains("(?<") {

This will also match lookbehind assertions (?<=...) and (?<!...), triggering an unnecessary parse + extract attempt that will find no named groups and return None. Not a correctness issue (it falls through correctly), but a minor perf hit. Consider using a more precise check like:

regex.exp.contains("(?<") && !regex.exp.starts_with("(?<=") // too simple

Or just leave a comment explaining the false positive is benign.


Helper: Missing Symbol.matchAll override (minor)

The _wrap_reg_exp helper overrides exec and Symbol.replace, but not Symbol.matchAll. For String.prototype.matchAll(), the spec calls Symbol.matchAll which creates an iterator that calls exec internally, so the overridden exec should handle it. However, this depends on the engine's internal implementation path. Babel's wrapRegExp helper similarly relies on exec for this, so this is probably fine — but worth noting.


Nit: Redundant conditional

crates/swc_ecma_transformer/src/regexp.rs:392-394

if g.name.is_some() {
    g.name = None;
}

The if guard is unnecessary — assigning None to an already-None field is a no-op. Can simplify to:

g.name = None;

Overall

The core transform logic (AST-based group collection, stripping, and backreference rewriting) is well-structured. The test cases cover important edge cases (nested groups, mixed named/unnamed, alternations, backreferences). The main actionable item is the missing helper import in the corejs3 test output which indicates a real bug.

@kdy1 kdy1 merged commit f62bfa9 into main Mar 12, 2026
202 of 203 checks passed
@kdy1 kdy1 deleted the claude/issue-11608-20260303-2335 branch March 12, 2026 01:23
@github-actions github-actions Bot added this to the Planned milestone Mar 12, 2026
@github-actions github-actions Bot modified the milestones: Planned, 1.15.21 Mar 22, 2026
@swc-project swc-project locked as resolved and limited conversation to collaborators Apr 22, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Development

Successfully merging this pull request may close these issues.

transform-named-capturing-groups-regex not worked

3 participants