Improve email regexp on edge cases#10601
Conversation
- Drastically improves performance on cases like `"<" + " " * N` - Last spaces are not needed anyway because this group is stripped later. Also spaces will be caught by `.` anyway.
|
please review |
CodSpeed Performance ReportMerging #10601 will not alter performanceComparing Summary
|
|
How performance changes?
About 25x speed improvement. |
Viicos
left a comment
There was a problem hiding this comment.
Thanks, this looks reasonable but to be extra careful we'll wait for other reviews as well.
Could you add the following to the test_address_valid test?:
('Samuel Colvin < s@muelcolvin.com>', 'Samuel Colvin', 's@muelcolvin.com'),
('Samuel Colvin <s@muelcolvin.com >', 'Samuel Colvin', 's@muelcolvin.com'),
('Samuel Colvin < s@muelcolvin.com >', 'Samuel Colvin', 's@muelcolvin.com'),Co-authored-by: Victorien <65306057+Viicos@users.noreply.github.com>
According to Wikipedia it is one of the valid DoS attack vectors. And at least some of known to me rate limiters will work only after validation step.
I think that existing tests are already covering this edge cases (spaces before/after the group). Should I still add yours? ('foo BAR <foobar@example.com >', 'foo BAR', 'foobar@example.com'),
('FOO bar <foobar@example.com> ', 'FOO bar', 'foobar@example.com'),
('Whatever < foobar@example.com>', 'Whatever', 'foobar@example.com'), |
|
Yep let's add those extra tests and fix the lints, but otherwise LGTM. |
I agree, just wanted to be careful as changing regex can be a source of breaking changes.
Missed these ones, then maybe only add these ones after it: ('Whatever <foobar@example.com >', 'Whatever', 'foobar@example.com'),
('Whatever < foobar@example.com >', 'Whatever', 'foobar@example.com'), |
Covering name + email case with spaces surrounding the email
Done. |
sydney-runkle
left a comment
There was a problem hiding this comment.
Nice, thanks for the help here! We appreciate the thorough explanations / refs :).
"<" + " " * N.anyway.Change Summary
I found that one single change in email regexp solves slowdowns on special invalid email strings. See related issue for details
Related issue number
Fixes #10600
Checklist
Selected Reviewer: @sydney-runkle