Regex caching prevents repeated warnings #8938

@headius

Because we have various mechanisms for caching previously-compiled regular expressions, we may not always trigger warnings expected by Ruby tests and specs.

While investigating failures caused by repeatedly running specs in #8930, I discovered that the following form of regexp warns only on its first compilation.

$ jruby -X-C -e '4.times { p eval "/foo(A{0,1}+)Abar/" }'
-e:1: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/

Even though the evaluated code is re-compiled on each loop iteration, the compiled regular expression is already cached after the first iteration, so the warning is not reported again. Compare to CRuby:

$ cx 3.4 ruby -e '4.times { p eval "/foo(A{0,1}+)Abar/" }'
(eval at -e:1):1: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression: /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
(eval at -e:1):1: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression: /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
(eval at -e:1):1: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression: /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
(eval at -e:1):1: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression: /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/

The warning is emitted by our regular expression engine, JOni, while it compiles the pattern, so we cannot force it to fire again once we have a compiled regular expression in hand.
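The interaction between caching and a compile-time warning can be sketched with a toy model. This is only an illustration, not JRuby's or JOni's actual code: `ToyRegexpCache`, `compile_with_warning`, and `fetch` are hypothetical names, and the "compiled" object is just a frozen copy of the source string.

```ruby
# Toy model only: none of these names are JRuby or JOni APIs.
class ToyRegexpCache
  attr_reader :warnings

  def initialize
    @cache = {}     # pattern source => stand-in for a compiled regexp
    @warnings = []  # stands in for messages written to stderr
  end

  # Stand-in for the engine's compile step. The warning is a side effect
  # of compiling, so it can only fire when this method actually runs.
  def compile_with_warning(source)
    if source.match?(/\{\d+,\d+\}\+/)
      @warnings << "nested repeat operator in /#{source}/"
    end
    source.dup.freeze
  end

  # A cache hit returns the stored object and skips compilation entirely.
  def fetch(source)
    @cache[source] ||= compile_with_warning(source)
  end
end

cache = ToyRegexpCache.new
4.times { cache.fetch("foo(A{0,1}+)Abar") }
puts cache.warnings.size
```

In this model four lookups record a single warning, which mirrors the JRuby output above: only the lookup that misses the cache reaches the compile step.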

This also affects cases where the expression is repeated as separate tokens in the source code, indicating that the caching of that regexp is happening outside the AST (probably in our LRU regexp cache):

$ jruby -X-C -e 'p /foo(A{0,1}+)Abar/; p /foo(A{0,1}+)Abar/; p /foo(A{0,1}+)Abar/; p /foo(A{0,1}+)Abar/'
(unknown):0: warning: nested repeat operator '?' and '+' was replaced with '*' in regular expression /foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/
/foo(A{0,1}+)Abar/

The RubySpec involved is shown here (from language/regexp/repetition_spec.rb):

  it "does not treat {m,n}+ as possessive" do
    -> {
      @regexp = eval "/foo(A{0,1}+)Abar/"
    }.should complain(/nested repeat operator/)
    @regexp.match("fooAAAbar").to_a.should == ["fooAAAbar", "AA"]
  end

While this is a pretty minor behavioral difference, it means the spec cannot be run repeatedly and still get the expected warning each time.
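The repeated-run behavior can also be checked outside the spec runner with a small script. This is a sketch that assumes the warning is written through Ruby's `$stderr`; `capture_stderr` is a hypothetical helper, not part of mspec.

```ruby
require "stringio"

# Hypothetical helper: runs the block with $stderr redirected to a
# StringIO and returns whatever was written to it.
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  yield
  $stderr.string
ensure
  $stderr = original
end

# Evaluate the same regexp source four times and count how many
# "nested repeat operator" warnings each evaluation produced.
counts = 4.times.map do
  output = capture_stderr { eval "/foo(A{0,1}+)Abar/" }
  output.scan(/nested repeat operator/).size
end

p counts
```

Going by the runs shown earlier, CRuby would be expected to report a warning for every evaluation, while JRuby would report one only for the first.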
