Investigate regex lazy quantifier with optional group regression in .NET 10 #125330

Draft
Copilot wants to merge 3 commits into main from copilot/fix-regex-lazy-quantifier

Copilot AI commented Mar 9, 2026

Description

Status: Investigation only — no fix implemented yet.

The regex pattern a(b.*?c)?d fails to match abccd in .NET 10, a regression from .NET 9. The bug occurs when a lazy loop with a minimum of 0 (such as *?) that compiles to Setlazy or Notonelazy (a character-class or negated-character loop) sits inside an optional group (...)?.

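using System.Text.RegularExpressions;

// Minimal reproduction: a lazy loop inside an optional group.
var regex = new Regex(@"a(b.*?c)?d");
var match = regex.Match("abccd");
// .NET 9: match.Success == true
// .NET 10: match.Success == false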

Investigation findings

  • Confirmed reproduction: Setlazy/Notonelazy with M=0 inside optional groups fail; Onelazy with M=0 works; M≥1 works
  • Bytecode analysis: The failing pattern's compiled bytecodes contain Setjump/Forejump (atomic group markers) wrapping the outer ? loop, preventing backtracking into the inner .*?
  • Root cause area: FindAndMakeLoopsAtomic in RegexNode.cs, introduced by commit c0f4c7dd39 ("Support auto-atomicity on {lazy} loops", #117943). For general Loop nodes, CanBeMadeAtomic checks only the loop body's first and last character classes against the subsequent node and doesn't account for characters matchable by internal lazy quantifiers. The MayContainBacktracking guard ("Fix incorrect atomic loop optimization when body contains backtracking", #124254) should block this, but the atomic wrapper still appears in the bytecodes through a path not yet identified.
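
To illustrate why an atomic wrapper around the outer ? loop would produce this failure, the sketch below compares the reported pattern with a hand-written variant that wraps the optional group in an explicit atomic group (?>...). The atomic variant is an assumption made for illustration only; it is not the tree the optimizer actually produces, and the class and field names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicDemo
{
    // Pattern from the report: the engine may backtrack into the optional group.
    public static readonly Regex Plain = new Regex(@"a(b.*?c)?d");

    // Hypothetical hand-written stand-in for the suspected rewrite:
    // the optional group is wrapped in an explicit atomic group.
    public static readonly Regex Atomic = new Regex(@"a(?>(b.*?c)?)d");

    public static void Main()
    {
        foreach (string input in new[] { "abcd", "abccd" })
        {
            // Print both results side by side; the plain result for "abccd"
            // is the one that differs between runtime versions.
            Console.WriteLine(
                $"{input}: plain={Plain.IsMatch(input)}, atomic={Atomic.IsMatch(input)}");
        }
    }
}
```

With the explicit atomic group, the inner .*? first matches zero characters, c consumes the first c, and the group commits. The trailing d then fails against the second c, and the engine cannot re-enter the group to let .*? consume one more character. The atomic variant therefore rejects abccd on any runtime while still accepting abcd, which is the same symptom the report describes for the plain pattern on .NET 10.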

What remains

  • Pinpoint the exact code path inserting the atomic wrapper despite MayContainBacktracking returning true
  • Implement the fix
  • Add test cases to Regex.Match.Tests.cs
  • Verify existing tests pass
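
As a starting point for the test cases, the cases from the investigation findings could be collected into a small matrix like the one below. This is only a sketch: apart from a(b.*?c)?d, the patterns are hypothetical examples chosen so that each loop would be expected to compile to the named opcode (assuming no other optimization rewrites it first), and they are not necessarily the patterns used during the investigation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyOptionalMatrix
{
    // One pattern per case from the findings. Only the first pattern comes
    // from the report; the other three are illustrative assumptions.
    public static readonly (string Label, string Pattern)[] Cases =
    {
        ("Notonelazy, min 0", @"a(b.*?c)?d"),
        ("Setlazy, min 0",    @"a(b[a-z]*?c)?d"),
        ("Onelazy, min 0",    @"a(bc*?c)?d"),
        ("Notonelazy, min 1", @"a(b.+?c)?d"),
    };

    public static void Main()
    {
        const string input = "abccd";
        foreach (var (label, pattern) in Cases)
        {
            // Under standard backtracking semantics every pattern here
            // should match the input; a False indicates the regression.
            bool matched = Regex.IsMatch(input, pattern);
            Console.WriteLine($"{label,-18} {pattern,-18} {matched}");
        }
    }
}
```

Running the same program on .NET 9 and .NET 10 and comparing the output would show which cases regress. Once the fix exists, the same rows could be moved into Regex.Match.Tests.cs.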

@github-actions (bot) added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) on Mar 9, 2026
@dotnet-policy-service
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI changed the title from "[WIP] Fix regex lazy quantifier issue in .NET 10" to "Investigate regex lazy quantifier with optional group regression in .NET 10" on Mar 9, 2026
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

Regex lazy quantifier with optional group behaves differently in .NET 10 vs .NET 9

2 participants