-
Notifications
You must be signed in to change notification settings - Fork 1.2k
OS#14763260: Correctly update RegExp.$1 after RegExp.prototype.test matches and used a cached value #4607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OS#14763260: Correctly update RegExp.$1 after RegExp.prototype.test matches and used a cached value #4607
Conversation
…atches and used a cached value Regression from chakra-core#3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails. Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed. Fixes: https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
| Assert(!lastMatch.IsUndefined()); | ||
| ScriptContext* scriptContext = this->GetScriptContext(); | ||
| const CharCount lastInputLen = lastInput->GetLength(); | ||
| const wchar_t* lastInputStr = lastInput->GetString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use WCHAR instead. It refers to char16 on xplat. (why? wchar_t is 4 bytes on some platforms while 2 bytes on others)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced it with char16 since all the APIs here are using char16. Seems this must be fine for xplat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is also fine! Thanks
dilijev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM but @akroshg should have a look too.
| { | ||
| AssertMsg(lastPattern != nullptr, "lastPattern should not be null"); | ||
| AssertMsg(lastInput != nullptr, "lastInput should not be null"); | ||
| AssertMsg(JavascriptOperators::GetTypeId(lastInput) != TypeIds_Null, "lastInput should not be JavaScript null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why these particular checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check similar things in JavascriptRegExpConstructor::SetLastMatch which is close to what we're doing here. Just kept it consistent.
| this->lastPattern = lastPattern; | ||
| this->lastInput = lastInput; | ||
| this->lastMatch.Reset(); | ||
| this->reset = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "reset" in the two lines above have the same meaning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We want to reset the last match to undefined and let the regex constructor know it has to reset it's cache via JavascriptRegExpConstructor::EnsureValues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between reset and invalidateLastMatch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing flag reset is used to indicate we should reset the cache backed-by JavascriptRegExpConstructor::EnsureValues. I added invalidateLastMatch which is used to indicate that there is no value stored in the last match field. This means we need to first calculate the last match before we use it. Ordinarily we assert that the last match field in JavascriptRegExpConstructor::EnsureValues is not undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reset flag is used in the logic for materializing all of the RegExp['$blah'] properties as actual JavascriptStrings (when needed) rather than input + offset which is faster to compute, and the logic ensures this only happens once. I'm not sure how the term "reset" applies to this logic but this code is apparently very old.
test/Regex/bug_OS14763260.js
Outdated
| r2.test(s2); | ||
| r1.test(s1); | ||
|
|
||
| assert.areEqual("abc", RegExp.$1, "Stale last match should be invalidated by second r1.test(s1)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other side-effects that regex matching can have? It might be good to add a few more tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add a few more tests.
|
|
||
| Field(UnifiedRegex::RegexPattern*) lastPattern; | ||
| Field(JavascriptString*) lastInput; | ||
| Field(bool) invalidatedLastMatch; // true if last match must be recalculated before use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move it down with other bool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point
| // - There was a match (test returned true) | ||
| if (invalidatedLastMatch) | ||
| { | ||
| this->lastMatch = RegexHelper::SimpleMatch(scriptContext, pattern, lastInputStr, lastInputLen, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SimpleMatch [](start = 47, length = 11)
could this function be re-entrant and change lastInput by any chance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope not. We don't pass any Var here, just the const char16* so I don't think user can apply getter/setters or use Symbol overrides. Seems as safe as the other operations here.
test/Regex/bug_OS14763260.js
Outdated
|
|
||
| r1.test(s1); | ||
|
|
||
| assert.areEqual("abc", RegExp.input, "RegExp.input property caclculated correctly"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: spelling of calculated in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good catch
…r RegExp.prototype.test matches and used a cached value Merge pull request #4607 from boingoing:FixRegExpTestCache Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails. Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed. Fixes: https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
|
Thanks, everyone |
…xp.$1 after RegExp.prototype.test matches and used a cached value Merge pull request #4607 from boingoing:FixRegExpTestCache Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails. Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed. Fixes: https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
…y update RegExp.$1 after RegExp.prototype.test matches and used a cached value Merge pull request #4607 from boingoing:FixRegExpTestCache Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails. Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed. Fixes: https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.
Fixed by adding an invalidation mechanism to the
JavascriptRegExpConstructor. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260