Expected behavior
When Sublime's custom regexp engine handles a regexp, it should behave identically to Oniguruma.
Actual behavior
Oniguruma has a quirk when parsing isolated options (e.g. (?i)) that Sublime does not replicate. When Oniguruma encounters isolated options, the remainder of the enclosing group (or of the expression, if there is no enclosing group) is implicitly grouped. For instance, Oniguruma treats a(?i)b|c as equivalent to a(?i:b|c).
The documentation is less than clear, and this behavior is unintuitive, but it is consistent. I suppose that option groups are parsed with the same precedence as the | operator.
Sublime's custom regexp engine, however, interprets isolated options differently: the option does not extend across a following |, so that a(?i)b|c is equivalent to (?:a(?i)b)|c.
As a result, the same construct may be interpreted differently depending on whether the expression triggers the Oniguruma engine or uses the native Sublime engine. This is confusing. In addition, this is an obstacle to third-party implementations and other tools.
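The two readings can be compared in any engine that supports scoped option groups. The following is a minimal sketch that uses Python's re module purely as a neutral third engine; it invokes neither Oniguruma nor Sublime's engine. The names ONIGURUMA_READING, SUBLIME_READING, and accepted are labels introduced here for illustration, and the Sublime reading is an assumption based on how the sample syntax pairs its rules.

```python
import re

# Explicit scoped-group spellings of the two ways `a(?i)b|c` can be read.
# These labels are illustrative; neither engine is actually invoked here.
ONIGURUMA_READING = re.compile(r"a(?i:b|c)")    # option covers the rest of the expression
SUBLIME_READING = re.compile(r"(?:a(?i:b))|c")  # assumed: option stops at the `|`

CANDIDATES = ["ab", "aB", "ac", "aC", "c", "C"]


def accepted(pattern, candidates=CANDIDATES):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if pattern.fullmatch(s)]


if __name__ == "__main__":
    print("Oniguruma reading:", accepted(ONIGURUMA_READING))
    print("Sublime reading:  ", accepted(SUBLIME_READING))
```

Under these assumptions the readings disagree on ac, aC, and c: the first accepts ac and aC but not c, while the second accepts c but neither ac nor aC.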
Sample syntax
%YAML 1.2
---
name: Test Option Parsing
scope: source.test-option-parsing
contexts:
  main:
    - match: a(?i)b|c
      scope: region.redish
    - match: (?:d(?i)e)|f
      scope: region.redish
    # Force Oniguruma
    - match: u(?i)v|w(?<!0)
      scope: region.bluish
    - match: x(?i:y|z)(?<!0)
      scope: region.bluish
Sample input
ab
ac
c
de
df
f
uv
uw
w
xy
xz
z
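As a rough guide to what to look for in the first three sample lines, the sketch below lists the spans that each reading of a(?i)b|c would match when scanning a line left to right. It again uses Python's re module as a stand-in and says nothing about how Sublime assigns scopes. The labels in READINGS and the helpers matched_spans and report are hypothetical names introduced here.

```python
import re

# The first three lines of the sample input, which exercise the first rule.
SAMPLE_LINES = ["ab", "ac", "c"]

# Hypothetical labels for the two readings of `a(?i)b|c`, spelled with scoped groups.
READINGS = {
    "covers rest of expression": re.compile(r"a(?i:b|c)"),
    "stops at alternation": re.compile(r"(?:a(?i:b))|c"),
}


def matched_spans(pattern, line):
    """Return the (start, end) span of every non-overlapping match in the line."""
    return [m.span() for m in pattern.finditer(line)]


def report(lines=SAMPLE_LINES):
    """Map each reading to the spans it would match on each line."""
    return {
        label: {line: matched_spans(pattern, line) for line in lines}
        for label, pattern in READINGS.items()
    }


if __name__ == "__main__":
    for label, per_line in report().items():
        print(label)
        for line, spans in per_line.items():
            print(f"  {line!r}: {spans}")
```

The lines ac and c are the ones that distinguish the readings: when the option covers the rest of the expression, ac matches as a whole and a lone c does not match; when the option stops at the alternation, only the c is matched in both lines.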
Notes
The core HTML syntax inadvertently relies upon this bug. I will submit a PR to correct that.
A suggested best practice is to avoid isolated options altogether, except at the very beginning of an expression (and never in variables). Instead, use noncapturing groups with flags. For example, instead of a(?i)b, use a(?i:b).
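The best practice above could be checked mechanically. The following is a minimal sketch of such a check, not an existing tool: find_isolated_options is a hypothetical helper that flags isolated option groups appearing anywhere other than offset 0 of a pattern. It assumes option letters are ASCII letters with an optional hyphen, skips escaped characters, and tracks character classes only naively (nested classes are not handled).

```python
import re

# Matches an isolated option group such as (?i), (?-i) or (?i-m).
# The letter set is deliberately broad; which flags an engine accepts is an assumption.
_ISOLATED = re.compile(r"\(\?[A-Za-z-]+\)")


def find_isolated_options(pattern):
    """Return (offset, text) for each isolated option group not at offset 0."""
    hits = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            in_class = ch != "]"  # naive: the first unescaped ] closes the class
        elif ch == "[":
            in_class = True
        elif ch == "(":
            m = _ISOLATED.match(pattern, i)
            if m:
                if m.start() > 0:  # options at the very beginning are allowed
                    hits.append((m.start(), m.group()))
                i = m.end()
                continue
        i += 1
    return hits


if __name__ == "__main__":
    for candidate in ["a(?i)b|c", "(?i)abc", "a(?i:b)", "u(?i)v|w(?<!0)"]:
        print(candidate, find_isolated_options(candidate))
```

The sketch only reports offsets and does not rewrite the pattern, because the correct scoped-group replacement depends on which of the two readings the author intended.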