Closed
Description
Bug Description
Empty character classes like [^\S\s] (which logically match nothing) incorrectly match empty strings instead of failing.
Reproduction
package main

import (
	"fmt"
	"regexp"

	"github.com/coregx/coregex"
)

func main() {
	pattern := `[^\S\s]`
	input := "abc"

	// stdlib - correct (no match)
	reStd := regexp.MustCompile(pattern)
	fmt.Println("stdlib:", reStd.MatchString(input)) // false

	// coregex - incorrect (matches!)
	reCg := coregex.MustCompile(pattern)
	fmt.Println("coregex:", reCg.MatchString(input)) // true
}

Root Cause
In nfa/compile.go:365-367, when compileCharClass receives an empty rune slice, it calls compileEmptyMatch():
if len(ranges) == 0 {
	return c.compileEmptyMatch() // WRONG!
}
compileEmptyMatch() creates an epsilon transition that matches the empty string. But an empty character class should never match: it is an impossible condition.
Semantics
- [\S\s] = any character = OpAnyChar
- [^\S\s] = NOT (any character) = empty set = matches nothing
- Go's parser correctly sets Rune: [] for empty classes
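The parser behaviour described above can be checked with the standard library alone. The following is a minimal sketch, not part of coregex: the helper names classRuneCount and stdlibMatches are made up for illustration, and it assumes the pattern is parsed with syntax.Perl flags.

```go
package main

import (
	"fmt"
	"regexp"
	"regexp/syntax"
)

// classRuneCount (hypothetical helper) parses pattern with Perl flags and
// returns the length of the top-level node's Rune slice. For a character
// class this slice holds range pairs, so 0 means the class is empty.
func classRuneCount(pattern string) (int, error) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return 0, err
	}
	return len(re.Rune), nil
}

// stdlibMatches (hypothetical helper) reports whether the standard library
// finds a match anywhere in input.
func stdlibMatches(pattern, input string) bool {
	return regexp.MustCompile(pattern).MatchString(input)
}

func main() {
	const pattern = `[^\S\s]`

	n, err := classRuneCount(pattern)
	if err != nil {
		panic(err)
	}
	fmt.Println("rune entries:", n)
	fmt.Println(`matches "abc":`, stdlibMatches(pattern, "abc"))
	fmt.Println(`matches "":`, stdlibMatches(pattern, ""))
}
```

If the description above holds, this should report zero rune entries and no match for either input, including the empty string. That last case is the one that separates "matches nothing" from "matches the empty string".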
Proposed Fix
Add a compileNoMatch() function that creates an NFA fragment with no path from its start state to its end state, so the fragment can never match. Use it for empty character classes instead of compileEmptyMatch().
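To illustrate the difference between the two fragments, here is a toy sketch. It does not use coregex's actual NFA types, which are not shown in this issue: stateKind, state, fragment, emptyMatchFragment, noMatchFragment and accepts are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// stateKind labels the three state types this toy NFA needs.
type stateKind int

const (
	epsilon stateKind = iota // follow next without consuming input
	fail                     // dead end: no outgoing transition
	match                    // accepting state
)

type state struct {
	kind stateKind
	next int // target state, used only by epsilon states
}

// fragment is a hypothetical stand-in for a compiled NFA fragment.
type fragment struct {
	states []state
	start  int
}

// emptyMatchFragment mirrors what the issue says compileEmptyMatch() does:
// an epsilon transition leading straight to the accepting state.
func emptyMatchFragment() fragment {
	return fragment{states: []state{{kind: epsilon, next: 1}, {kind: match}}}
}

// noMatchFragment is the proposed shape: the start state is a dead end,
// so the accepting state exists but is unreachable.
func noMatchFragment() fragment {
	return fragment{states: []state{{kind: fail}, {kind: match}}}
}

// accepts reports whether the accepting state is reachable from start
// without consuming input, which is all these two fragments can do.
func (f fragment) accepts() bool {
	seen := make([]bool, len(f.states))
	id := f.start
	for !seen[id] {
		seen[id] = true
		s := f.states[id]
		switch s.kind {
		case match:
			return true
		case fail:
			return false
		default:
			id = s.next
		}
	}
	return false // epsilon cycle that never reaches match
}

func main() {
	fmt.Println("empty-match fragment accepts:", emptyMatchFragment().accepts())
	fmt.Println("no-match fragment accepts:", noMatchFragment().accepts())
}
```

Because neither fragment consumes input, an unanchored search reaches the same verdict at every position. The empty-match fragment succeeds everywhere, which is the reported bug, while the dead-end fragment succeeds nowhere.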