Regression in @babel/plugin-transform-named-capturing-groups-regex with unicode regexps #10601

@Jessidhia

Bug Report

Current Behavior
Likely a regression from #10430: the contents of regexps with the u flag are being transformed into their non-u equivalents, but the regexp's flags are not updated to match. This becomes a more serious bug when the s flag is also present, because the transformed . no longer matches newlines.

Input Code

const re = /<(?<tag>\d)+>.*?<\/\k<tag>>/su
console.log(re.test('<0>xxx\nyyy</0>')) // should be true

Expected behavior/code
Only the named group should be rewritten to a numbered group. The rest of the pattern should be left alone, so that re.test in the input code still returns true.

Output with @7.4.5 (ignoring helpers, just the regexp):

/<(\d)+>.*?<\/\1>/su

Output with @7.6.3:

/<([0-9])+>(?:[\0-\t\x0B\f\x0E-\u2027\u202A-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF])*?<\/\1>/su

Note that the regexp did not need to be transformed away from being a unicode regexp at all, and that the u flag was not removed after the transformation. The character class that replaces . specifically skips both \x0A and \x0D (the \n and \r characters), as well as \u2028 and \u2029. All of these are supposed to be matched because the regexp has the s flag.
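A quick way to confirm the difference is to run both outputs against the test string from the input code. This is only a sketch for reproduction: the two literals are copied from the outputs above, and the variable names are made up for illustration. It assumes a runtime with native support for the s and u flags (for example the Node v12 listed under Environment).

```javascript
const input = "<0>xxx\nyyy</0>";

// Output with 7.4.5: only the named group was rewritten to \1.
const before = /<(\d)+>.*?<\/\1>/su;

// Output with 7.6.3: . was expanded into a class that excludes line terminators.
const after = /<([0-9])+>(?:[\0-\t\x0B\f\x0E-\u2027\u202A-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF])*?<\/\1>/su;

const results = {
  before: before.test(input),
  after: after.test(input),
};

console.log(results); // { before: true, after: false }
```

The 7.6.3 pattern fails because none of its alternatives can consume the \n between xxx and yyy, so the closing tag is never reached.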

Babel Configuration (.babelrc, package.json, cli command)

{
  "plugins": ["@babel/plugin-transform-named-capturing-groups-regex"]
}

Environment

  • Babel version(s): 7.6.4
  • Node/npm version: v12.10.0 / yarn 1.19.1
  • OS: macOS 10.15
  • Monorepo: yes
  • How you are using Babel: babel-jest

Possible Solution
There needs to be some way to tell regexpu not to do the unicode downleveling. Alternatively, if the unicode transformation is "inevitable", the dotAll flag needs to be respected when . is expanded.
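To illustrate what "respecting the dotAll flag" could mean, here is a minimal sketch. The helper dotSource is hypothetical and is not part of regexpu or Babel. It also only covers BMP characters and ignores the surrogate-pair handling that a real unicode downleveling would need.

```javascript
// Hypothetical helper: choose the replacement for "." based on the regexp's flags.
// With the s flag, "." must match every character, including line terminators.
// Without it, "." must exclude \n, \r, \u2028 and \u2029.
function dotSource(flags) {
  return flags.includes("s") ? "[\\s\\S]" : "[^\\n\\r\\u2028\\u2029]";
}

// The generated regexps carry no flags; the flag's effect is baked into the class.
const withDotAll = new RegExp(`a${dotSource("su")}b`);
const withoutDotAll = new RegExp(`a${dotSource("u")}b`);

console.log(withDotAll.test("a\nb")); // true
console.log(withoutDotAll.test("a\nb")); // false
console.log(withoutDotAll.test("axb")); // true
```

The 7.6.3 output above behaves like the second case even though the regexp has the s flag.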
