Skip to content

Conversation

@kolkov
Copy link
Contributor

@kolkov kolkov commented Jan 16, 2026

Summary

Replace O(n*m) PikeVM verification with O(n) DFA verification for multiline suffix patterns.

Issue: #99 (Rust regex 84x faster on (?m)^/.*\.php)

Root Cause (from research)

The 84x performance gap was NOT due to missing DFA state acceleration. Research revealed:

  1. Rust hybrid DFA does NOT use per-state acceleration — only dense DFA does
  2. coregex already has partial acceleration in dfa/lazy/
  3. Real bottleneck: MultilineReverseSuffixSearcher used PikeVM (O(n*m)) instead of DFA (O(n))

Changes

  • MultilineReverseSuffixSearcher.forwardDFA replaces pikevm field
  • Uses lazy.DFA.SearchAtAnchored() for linear-time anchored matching
  • lazy.CompileWithConfig() creates forward DFA with proper config

Performance

Case Before After Speedup
No-match (2KB) 1136 ns 108 ns 10.5x
Long no-match 25937 ns 197 ns 131x
Large input (6MB) 66 ms ~5-10 ms 10-30x (expected)

Testing

  • All existing tests pass
  • Linter clean (golangci-lint)
  • No regressions on other patterns
  • regex-bench CI validation (after merge)

Technical Details

Research document: docs/dev/research/DFA_STATE_ACCELERATION_RESEARCH.md

Key insight from Rust code (hybrid/search.rs):

The hybrid DFA search does NOT call any acceleration-related functions. It uses loop unrolling and tagged state ID checks for performance.

Fixes #99 (Phase 1)

Replace O(n*m) PikeVM verification with O(n) DFA verification for
multiline suffix patterns like (?m)^/.*\.php

Changes:
- MultilineReverseSuffixSearcher.forwardDFA replaces pikevm field
- Uses lazy.DFA.SearchAtAnchored() for anchored matching
- lazy.CompileWithConfig() creates forward DFA with proper config

Performance:
- No-match cases: 10-131x faster (microseconds vs milliseconds)
- Large input (6MB): expected 10-30x improvement

Research showed Rust hybrid DFA also uses DFA (not per-state
acceleration) for verification - this is the correct approach.
@github-actions
Copy link

Benchmark Comparison

Comparing main → PR #100

Summary: geomean 252.2n 252.2n -0.01%

⚠️ Potential regressions detected:

geomean                               ³                +0.00%               ³
geomean                               ³                +0.00%               ³
geomean                         ³                +0.00%               ³
geomean                         ³                +0.00%               ³
AhoCorasickVsStdlib/coregex_Count-4                     1.470µ ± ∞ ¹    1.477µ ± ∞ ¹     +0.48% (p=0.032 n=5)
AnchoredAlt_ManyBranches_Coregex/GET-4                  70.95n ± ∞ ¹    73.20n ± ∞ ¹     +3.17% (p=0.008 n=5)
FirstByteReject_Stdlib-4                                40.44n ± ∞ ¹    40.57n ± ∞ ¹     +0.32% (p=0.008 n=5)
ConcurrentIsMatch-4                                     52.52n ± ∞ ¹    52.77n ± ∞ ¹     +0.48% (p=0.016 n=5)
IPRegex_Find/stdlib_64KB_sparse-4                       110.6µ ± ∞ ¹   1618.6µ ± ∞ ¹  +1363.92% (p=0.008 n=5)
IPRegex_Find/coregex_64KB_sparse-4                      2.405µ ± ∞ ¹    3.883µ ± ∞ ¹    +61.46% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare

@kolkov kolkov merged commit fe0bc83 into main Jan 16, 2026
15 checks passed
@kolkov kolkov deleted the feature/dfa-verification-multiline branch January 16, 2026 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: Multiline patterns 84x slower than Rust regex

2 participants