<regex>: std::regex should not use unsigned int to describe input lengths #439

@BillyONeal

Describe the bug
Our std::regex implementation uses unsigned int everywhere to store lengths and offsets in the input. This fails on 64-bit platforms, where inputs can be larger than 4 GiB.
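
As a rough illustration of why a 32-bit length breaks down at this boundary, the sketch below narrows a 64-bit length to 32 bits. The helper `narrow_length` is hypothetical and is not taken from the STL sources; it assumes unsigned int is 32 bits wide, as it is with MSVC, and only shows the arithmetic, not how the regex engine actually stores its offsets.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; it is not taken from the STL sources.
// It models storing an input length in a 32-bit unsigned field, assuming
// unsigned int is 32 bits wide (as it is with MSVC).
constexpr std::uint32_t narrow_length(std::uint64_t length) {
    // Unsigned narrowing is well defined: the value is reduced modulo 2^32.
    return static_cast<std::uint32_t>(length);
}

// The largest length that still fits survives the conversion unchanged...
static_assert(narrow_length(0xFFFF'FFFFull) == 0xFFFF'FFFFu, "fits in 32 bits");
// ...but one more character wraps to zero, so a 4 GiB input looks empty.
static_assert(narrow_length(0x1'0000'0000ull) == 0u, "wraps to zero");
```

The two values checked here are the same sizes the repro passes to test(): the last length representable in 32 bits, and the first one that is not.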

Command-line test case
STL version (git commit or Visual Studio version): Visual Studio 2019 version 16.4
This repro must be run on a 64-bit platform.

C:\Users\bion\Desktop>type test.cpp
#include <assert.h>
#include <regex>
#include <stddef.h>
#include <stdio.h>
#include <string>

void test(size_t number) {
  printf("Testing %zu...\n", number);
  try {
    std::string s(number, 'a');
    const auto pattern = "a{" + std::to_string(number) + "}";
    std::regex r(pattern.c_str());
    std::smatch results;
    assert(regex_match(s, results, r));
    assert(results.size() == 1);
    const auto &firstResult = results[0];
    assert(static_cast<size_t>(firstResult.second - firstResult.first) ==
           number);
  } catch (const std::exception &e) {
    puts(e.what());
  }
}

int main() {
  test(0xFFFF'FFFFull);
  test(0x1'0000'0000ull);
  puts("pass");
}

C:\Users\bion\Desktop>cl /EHsc /W4 /WX /nologo /O2 .\test.cpp
test.cpp

C:\Users\bion\Desktop>.\test.exe
Testing 4294967295...
regex_error(error_complexity): The complexity of an attempted match against a regular expression exceeded a pre-set level.
Testing 4294967296...
Assertion failed: regex_match(s, results, r), file .\test.cpp, line 14

C:\Users\bion\Desktop>

Expected behavior
These matches should complete in a reasonable amount of time, no exceptions should be thrown, and the asserts should pass.

Also tracked by Microsoft-internal VSO-177615 / AB#177615.

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

Labels: bug, regex, vNext
