New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: parse mode chars should not be considered chars #13975
base: main
Are you sure you want to change the base?
Conversation
They are simply considered part of the group start.
| * Holds if a parse mode starts between `start` and `end`. | ||
| */ | ||
| private predicate flag_group_start(int start, int end) { | ||
| exists(int no_modes_end | |
Check warning
Code scanning / CodeQL
Omittable 'exists' variable Warning
in this argument
| @@ -7,3 +7,6 @@ | |||
| # Treatment of line breaks | |||
| re.compile(r'(?:.|\n)*b') # No ReDoS. | |||
| re.compile(r'(?:.|\n)*b', re.DOTALL) # Has ReDoS. | |||
| re.compile(r'(?i)(?:.|\n)*b') # No ReDoS. | |||
| re.compile(r'(?s)(?:.|\n)*b') # Has ReDoS. | |||
| re.compile(r'(?is)(?:.|\n)*b') # Has ReDoS. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add the following as a test:
re.compile(r'(?is)X(?:.|\n)*Y')The message for that should ideally mention that the attack string should start with an X.
If the message doesn't say that, then the "start" of the regex is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent suggestion, thanks! The message does not mention X, I will look into why..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that (?is) is still considered a group, but with no children.
It shouldn't be a group, it shouldn't appear as a RegExpTerm at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FirstItem does find the X..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe as a test revert this commit: e2de0e6
That allows you to view the tree structure of the parsed regular expression in the ast-viewer in VSCode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
..but NfaUtils::prefix does not..
| @@ -105,3 +105,6 @@ | |||
| | redos.py:391:15:391:25 | (\\u0061\|a)* | This part of the regular expression may cause exponential backtracking on strings starting with 'X' and containing many repetitions of 'a'. | | |||
| | unittests.py:5:17:5:23 | (\u00c6\|\\\u00c6)+ | This part of the regular expression may cause exponential backtracking on strings starting with 'X' and containing many repetitions of '\u00c6'. | | |||
| | unittests.py:9:16:9:24 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | |||
| | unittests.py:11:20:11:28 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | |||
| | unittests.py:12:21:12:29 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | |||
| | unittests.py:13:22:13:30 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings starting with 'x' and containing many repetitions of '\\n'. | | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love how the regex uses an upper-case X, but the alert-message uses a lower-case x (because it's case-insensitive).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test outcomes look good to me now 👍
I can't really comment on the parser architecture.
This is a PR on top of #13779 to aid in the discussion around parse mode characters.