-
Notifications
You must be signed in to change notification settings - Fork 4
Closed
Description
Bug Description
Case-insensitive patterns like (?i)HELLO fail to match lowercase text hello when unanchored.
Reproduction
package main
import (
"fmt"
"regexp"
"github.com/coregx/coregex"
)
func main() {
pattern := `(?i)HELLO`
input := "hello"
// stdlib - WORKS
reStd := regexp.MustCompile(pattern)
fmt.Println("stdlib:", reStd.FindAllString(input, -1)) // [hello]
// coregex - FAILS
reCg := coregex.MustCompile(pattern)
fmt.Println("coregex:", reCg.FindAllString(input, -1)) // []
}Root Cause
The literal extractor in literal/extractor.go:129-135 ignores the FoldCase flag from syntax.Regexp. It extracts HELLO as a literal prefix, and the prefilter searches for HELLO case-sensitively, missing hello.
Impact
All unanchored case-insensitive patterns are affected:
(?i)abconABCworks (finds first match)(?i)abconABCabcfails (misses lowercaseabc)
Anchored patterns work correctly because they bypass prefilter.
Proposed Fix
Option 1: Don't extract literals when FoldCase flag is set (safe, simple)
Option 2: Expand case-insensitive literals into all variations (complex)
Metadata
Metadata
Assignees
Labels
No labels