bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler #18284
Conversation
|
This fix looks good to me! |
| @@ -937,7 +937,7 @@ class AbstractBasicAuthHandler: | |||
|
|
|||
| # allow for double- and single-quoted realm values | |||
| # (single quotes are a violation of the RFC, but appear in the wild) | |||
| rx = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+' | |||
| rx = re.compile('(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+' | |||
serhiy-storchaka
Jan 30, 2020
Member
(?:.*,)* is equivalent to (?:.*,)?.
But since this regular expresion is only used with search(). (?:.*,)*[ \t]* can be removed at all.
I'll analyze whether it is correct or there is an error in the regular expression.
(?:.*,)* is equivalent to (?:.*,)?.
But since this regular expresion is only used with search(). (?:.*,)*[ \t]* can be removed at all.
I'll analyze whether it is correct or there is an error in the regular expression.
serhiy-storchaka
Jan 30, 2020
Member
Well, I am cannot say that I completely understand the code, but to give it some sense we can either
- Replace
rx.search() with rx.match() and replace (?:.*,)* with (?:.*,)?.
or
- Keep
rx.search() and replace (?:.*,)* with (?:^|,).
Do not keep (?:[^,]*,)*. It is a waster of resources.
Well, I am cannot say that I completely understand the code, but to give it some sense we can either
- Replace
rx.search()withrx.match()and replace(?:.*,)*with(?:.*,)?.
or
- Keep
rx.search()and replace(?:.*,)*with(?:^|,).
Do not keep (?:[^,]*,)*. It is a waster of resources.
serhiy-storchaka
Jan 30, 2020
Member
Humm, options 1 and 2 are not equivalent if the field value contains more than one challenge. Option 2 is closer to the current behavior. But correct support of more than one challenge need rewriting the code.
Humm, options 1 and 2 are not equivalent if the field value contains more than one challenge. Option 2 is closer to the current behavior. But correct support of more than one challenge need rewriting the code.
davidfraser
Feb 17, 2020
The current patch seems to give an O(n^3) time to evaluate - much better than O(2^n), but still very slow - with 2000 commas it takes about a minute to evalute. With 65000 it takes much, much longer. Testing using the code from here gave the following (commas, seconds) values:
[(100, 0.124), (250, 0.261), (500, 0.923), (750, 2.85), (1000, 6.433), (1250, 12.608), (1500, 21.576), (2000, 50.751)]
The current patch seems to give an O(n^3) time to evaluate - much better than O(2^n), but still very slow - with 2000 commas it takes about a minute to evalute. With 65000 it takes much, much longer. Testing using the code from here gave the following (commas, seconds) values:
[(100, 0.124), (250, 0.261), (500, 0.923), (750, 2.85), (1000, 6.433), (1250, 12.608), (1500, 21.576), (2000, 50.751)]
serhiy-storchaka
Mar 24, 2020
Member
I was wrong, the first option is equivalent to the current behavior (returns the last realm).
I was wrong, the first option is equivalent to the current behavior (returns the last realm).
|
Would be nice to add a test. |
|
Just a heads up, CVE-2020-8492 has been created. I'm not sure how Python CVEs are generally tracked, but it may be useful to include the information on the bug tracker issue |
|
Not only
We can either simplify the regex to prevent the "catastrophic backtracking" or even remove the prefix. UPDATE: Oops, my example was wrong, I fixed it :-) |
|
Does this also fix https://bugs.python.org/issue38826 ? |
It seems to me that the >>> header = 'basic realm="1", x, other realm="2"'
>>> re.search("(?:.*,)*[ \t]*([^ \t]+)[ \t]+", header).group(1)
'other'
>>> re.search("[ \t]*([^ \t]+)[ \t]+", header).group(1)
'basic'
>>> I don't see a way to fix this by just changing the regex while preserving the previous behavior. Then again, corner cases of the previous behavior might be wrong. |
7c4bae4
to
cc59fb1
|
I rebased my PR and added more tests. |
|
@serhiy-storchaka: I don't understand if you consider that the fix is wrong or that the fix is not enough (it remains possible to create a denial of service)? |
|
@vstinner your fix helps, but we can do better. It has cubic complexity, my suggestion has quadratic complexity. It is possible to implement an algorithm with linear complexity, but not with such small changes. |
|
I also added some tests and implemented a simpler complexity regex - see master...davidfraser:urllib_basic_auth_regex |
|
It's worth seeing how the results of this regex are actually used Note my comment in f79379c: Note that the original regex was roughly O(2**n) |
|
WWW-Authenticate is badly specified. The RFC doesn't specify if a single HTTP header can contain multiple challenges. I found these resources:
A variant is to have multiple WWW-Authenticate: one challenge per WWW-Authenticate header. By the way, AbstractBasicAuthHandler code contains this interesting comment:
Current behavior:
For example, |
cc59fb1
to
477be6e
Sorry, I misunderstood this proposition. In fact, I proposed something similar except that I missed the "start of the string" (regex I decided to write a way more complex change to not only fix the vulnerability, but also fix the parser since it didn't look possible to fix the regex without changing the behavior. Currently, the code uses the last realm if there are multiple challenges per header. I fixed this behavior to use the realm of the first Basic challenge. I also modified the code to support multiple headers, except of only parsing the first one. |
137cf0b
to
ee8ff4f
ee8ff4f
to
3c7dae4
3c7dae4
to
c461645
|
Ok, the PR is now ready for a new round of reviews. I fixed the vulnerability but I also changed the code to parse all WWW-Authenticate HTTP Headers and accept multiple challenges per header. |
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4) Co-authored-by: Victor Stinner <vstinner@python.org>
|
GH-19292 is a backport of this pull request to the 3.7 branch. |
|
|
Thanks @vstinner for the PR |
|
Thanks @vstinner for the PR |
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4) Co-authored-by: Victor Stinner <vstinner@python.org>
|
GH-19296 is a backport of this pull request to the 3.8 branch. |
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4) Co-authored-by: Victor Stinner <vstinner@python.org>
|
GH-19297 is a backport of this pull request to the 3.7 branch. |
…-19296) The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> Co-authored-by: Victor Stinner <vstinner@python.org> (cherry picked from commit 0b297d4)
…-19297) The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> Co-authored-by: Victor Stinner <vstinner@python.org> (cherry picked from commit 0b297d4)
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4)
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4)
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4)
…-19304) The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com> (cherry picked from commit 0b297d4)
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
…9305) The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge.
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge. Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking.
Vulnerability reported by Matt Schwager.
https://bugs.python.org/issue39503