subroutine in lookbehind
Bug Description
When I use a subroutine call inside a lookbehind with the PCRE flavor, the website reports that the subroutine cannot be used there because it makes the lookbehind non-fixed width (\g<1> This token can not be used in a lookbehind as it makes it non-fixed width). However, the website does not appear to check whether the subroutine itself is fixed width. The PCRE documentation states that subroutines are allowed in a lookbehind as long as the subroutine is fixed width, and that recursion is not allowed.
Reproduction steps
REGEX:
(?=@?(['\x{2018}-\x{201B}\"\x{201C}-\x{201E}])(?:(?<!@\g<1>)|\s*$))
Expected Outcome
A working regex that matches (at the start of) ' or @', but only if @' is the last meaningful thing on the line. (' can be replaced by any of the quote characters, as this detects the start of a quoted string of any kind in PowerShell.)
Browser
Edge, Windows 10 19H1 Insiders
You are correct, and this is a bug, but I wonder whether it's feasible to test for it. The current implementation is quite naive, and I'm not sure I can extend it to check this properly. I am also unsure how much compute time it might take (a reference might point to another group, which references another, and so on).
Question is, perhaps I should allow for subroutine calls to be made in lookbehinds, and let the engine catch any potential errors. Unfortunately, you won't get any help from the editor in those cases.
Would the 'look-behind contains non-fixed length expression' error get forwarded when the engine catches it? That's the error I would get if I used a non-fixed length subroutine (recursive or not) in the look-behind, and I would then have to figure out where. Sure, it would be nice if it could show me exactly where the non-fixed length expression is, but that's probably not essential.
Do you already track a property such as 'group is fixed length' (or actually 'all alternates are fixed length') as you evaluate an expression for errors? If you always compiled that information, and all sub-groups were already resolved, then by the time you checked the \g<x> in the look-behind you could simply check x.IsFixedLength, which would ultimately drive the IsFixedLength property of the look-behind group itself.
Do you have any means to offer warnings/cautions instead of terminating errors? A warning that 'subroutine in look-behind must resolve to fixed length expression in order to be successful' wouldn't hurt.
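The IsFixedLength idea above can be sketched roughly as follows. This is only an illustration in Python, not the site's actual implementation: the node classes (Lit, Seq, Alt, Repeat, Call) and the fixed_width function are hypothetical names invented for the example, and the pattern is assumed to be already parsed into such a tree. Results are cached per group so chained references are resolved only once, and a group that is reached again while it is still being resolved is treated as recursion. Note that the sketch requires all alternatives to have the same width, which is stricter than PCRE, since PCRE lets the top-level alternatives of a lookbehind differ in length.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Union


@dataclass
class Lit:
    """A literal or character class: always matches exactly `width` characters."""
    width: int = 1


@dataclass
class Seq:
    """Concatenation of items."""
    items: List["Node"]


@dataclass
class Alt:
    """Alternation: a|b|c."""
    branches: List["Node"]


@dataclass
class Repeat:
    """Quantifier: item{lo,hi}; hi=None means unbounded."""
    item: "Node"
    lo: int
    hi: Optional[int]


@dataclass
class Call:
    """Subroutine call such as \\g<1>."""
    group: int


Node = Union[Lit, Seq, Alt, Repeat, Call]


def fixed_width(node: Node, groups: Dict[int, Node],
                cache: Optional[Dict[int, Optional[int]]] = None,
                active: Optional[Set[int]] = None) -> Optional[int]:
    """Return the width if `node` always matches the same number of
    characters, otherwise None (variable width, recursion, unknown group)."""
    cache = {} if cache is None else cache
    active = set() if active is None else active

    if isinstance(node, Lit):
        return node.width
    if isinstance(node, Seq):
        total = 0
        for item in node.items:
            w = fixed_width(item, groups, cache, active)
            if w is None:
                return None
            total += w
        return total
    if isinstance(node, Alt):
        # Simplification: every branch must have the same fixed width.
        widths = {fixed_width(b, groups, cache, active) for b in node.branches}
        return widths.pop() if len(widths) == 1 else None
    if isinstance(node, Repeat):
        if node.hi is None or node.lo != node.hi:
            return None
        w = fixed_width(node.item, groups, cache, active)
        return None if w is None else w * node.lo
    if isinstance(node, Call):
        if node.group in cache:          # already resolved earlier
            return cache[node.group]
        if node.group in active or node.group not in groups:
            return None                  # recursion or missing group
        active.add(node.group)
        w = fixed_width(groups[node.group], groups, cache, active)
        active.discard(node.group)
        cache[node.group] = w
        return w
    raise TypeError(f"unknown node type: {type(node).__name__}")


# The lookbehind body from the report, @\g<1>, where group 1 is a
# single character class: one character plus one character.
groups = {1: Lit()}
print(fixed_width(Seq([Lit(), Call(1)]), groups))
```

With a cache like this, each group is resolved at most once per check, so a chain of references costs about as much as walking the pattern a single time.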
The current implementation for this is not something I'm very happy with. The group width calculation is something I would like to redo, and if I do that, it would be possible to accurately detect this.
However, as of right now, perhaps it's best to allow these constructs in the lookbehind and show a small warning notice whenever they are placed inside one.
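A warning of that kind could be produced by a light scan of the pattern, without computing any widths. The sketch below is a minimal illustration in Python and makes several assumptions: the function name lookbehind_subroutine_warnings and the warning text are made up for the example, and the scanner deliberately ignores \Q...\E quoting, (?#...) comments, extended mode, and a literal ] placed first in a character class. It tracks open groups on a stack and reports every subroutine call that sits inside at least one lookbehind.

```python
import re
from typing import List, Tuple

# PCRE subroutine call syntaxes: \g<n>, \g'n', (?R), (?1), (?+1), (?-1),
# (?&name), (?P>name). \g{n} is a backreference and is intentionally excluded.
SUBROUTINE = re.compile(r"\\g<[^>]+>|\\g'[^']+'|\(\?(?:R|[+-]?\d+|&\w+|P>\w+)\)")

WARNING = ("subroutine in lookbehind must resolve to a fixed length "
           "expression in order to be successful")


def lookbehind_subroutine_warnings(pattern: str) -> List[Tuple[int, str, str]]:
    """Return (offset, token, message) for each subroutine call in a lookbehind."""
    warnings = []
    stack = []          # one bool per open group: True if it is a lookbehind
    in_class = False    # True while inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if in_class:
            if ch == "\\":
                i += 2              # skip the escaped character
                continue
            if ch == "]":
                in_class = False
            i += 1
            continue
        call = SUBROUTINE.match(pattern, i)
        if call:
            if any(stack):          # nested anywhere inside a lookbehind
                warnings.append((i, call.group(), WARNING))
            i = call.end()
            continue
        if ch == "\\":
            i += 2                  # any other escape sequence
            continue
        if ch == "[":
            in_class = True
        elif ch == "(":
            stack.append(pattern.startswith(("?<=", "?<!"), i + 1))
        elif ch == ")" and stack:
            stack.pop()
        i += 1
    return warnings


for offset, token, message in lookbehind_subroutine_warnings(r"(['\"])(?<!@\g<1>)"):
    print(f"{offset}: {token}: {message}")
```

Because this only produces a notice, the pattern is still handed to the engine, which remains the final judge of whether the lookbehind is valid.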
Hello, there is nothing wrong with allowing the subroutine call in the lookbehind and letting PCRE catch the error. You wrote:
"Question is, perhaps I should allow for subroutine calls to be made in lookbehinds, and let the engine catch any potential errors. Unfortunately, you won't get any help from the editor in those cases."
You should do this rather than report a false positive.