Update line_char to tolerate <regex> round-trip conversions#478
Update line_char to tolerate <regex> round-trip conversions#478StephanTLavavej wants to merge 2 commits intobuild2:masterfrom StephanTLavavej:better-regex
line_char to tolerate <regex> round-trip conversions#478Conversation
|
Thanks for the PR and the detailed description of what's going on. Will look into it shortly and get back to you. I also agree that what we do here is unfortunate but at the moment we don't see a better way to provide the same functionality (line-level regex) without heroic measures (like writing our own regex implementation or forking |
|
Ok, this has been merged with a few tweaks as 68f80c8 The only tweak other than comments is to make the change in the While the change to We've also CI'ed it on the current MSVC's STL version as well as all the other STL implementations we support and there are no issues. Let me know if you are happy with the end result and I will close this PR. I can also stage this change if you would like to try the "official" distribution (and maybe replace the old version in your "Real World Code" test suite). |
Yeah, this is the point of these casts. They are used used to check whether the value of a character type can be represented by an integer of type The non-special |
Yes, this is great, thanks! I've verified that your merged changes work with my updated STL.
Yes, that would be very helpful for our test team. |
Ok, here you go: https://stage.build2.org/0/0.18.0-a.0/
Thanks again for your help on this. |
In MSVC's STL, we recently merged microsoft/STL#5592 to remove non-Standard
_Uelemmachinery from the<regex>parser. This broke build2, which we regularly build in our "Real World Code" test suite with development versions of MSVC.The issue is that build2's use of a custom character type leads to an extremely brittle dependence on the internal implementation of
<regex>, as indicated by the numerous comments in your codebase about the details of what MSVC's STL and libstdc++ happened to be doing. While parsing a regex, we need to compare its characters against escape sequences likeR"(\bcats\b)"(word boundary assertions, for example). Our recent PR did this by constructing the user-defined character type fromchars like'b'. This is rejected by build2's assertions.While I believe that build2 should strongly consider refactoring this entire code, I've created an additional MSVC STL PR that adjusts our comparisons to avoid requesting larger changes from build2: microsoft/STL#5675. Instead of constructing user-defined character types from
char(and then comparing homogeneously), we'll directly compare againstchar. This is less theoretically pure, but uses build2's heterogeneous comparison operators, avoiding most problems.However, we still need to detect whether user-defined character types can round-trip through
char(to avoid being confused by truncation). This PR here is a small change to support that. Inregex.hxx, it applies whatoperator char ()is already doing, tooperator T (). This allows all user-defined characters to be implicitly converted, returning\a(BELL) for your "non-special" type. (MSVC's STL needs this now because we sometimes convert to types other thancharspecifically, as part of our_Uelemremoval/overhaul.) Then inregex.cxx, I'm adjustingline_char's constructor to accept this\a(BELL) character, to accept round-tripping without asserting.This appears to pass build2's tests locally on my machine, with the updated MSVC STL.