Issue #5775 notes that IPv6 Zone Identifiers are not parsed correctly when the Zone ID is itself a valid percent-encoded character from the UNRESERVED_SET. That issue was incorrectly closed as a duplicate of #5126; in fact, #5126 concerns a different bug in urllib, which has since been resolved. This ticket is a duplicate of #5775, but commenting on that ticket is now locked.
IPv6 addresses can have the form fe80::1:2:3:4%zone, where zone is any alphanumeric sequence and is platform-dependent. To allow their use in URLs, where %zone could be interpreted as a percent-escaped character, RFC 6874 requires the % to be replaced with its own percent-escaped representation, %25, e.g. http://[fe80::1:2:3:4%25zone].
In requests, this is not enough to protect the Zone ID in the URL if the zone, read together with the preceding %, forms a valid percent-escape of a character from the UNRESERVED_SET. Specifically, url.py::_normalize_host removes the RFC 6874 sequence and replaces it with a simple %; the round trip through quote/unquote_unreserved in utils.py::requote_uri, called from PreparedRequest::prepare_url, then decodes the Zone ID into the percent-escaped character anyway.
Note that doubly escaping the percent works, but the URL is then neither intuitive nor RFC-compliant.
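The two steps described above can be illustrated with a minimal, stdlib-only sketch. It is a simplified stand-in, not the library code: `collapse_zone_escape`, `unquote_unreserved_sketch` and `round_trip` are hypothetical names, and the sketch only models the singly escaped and unescaped cases.

```python
import string

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def collapse_zone_escape(host):
    """Step 1 (simplified): turn the RFC 6874 '%25' zone delimiter into a bare '%'."""
    addr, sep, zone = host.partition("%25")
    return addr + "%" + zone if sep else host


def unquote_unreserved_sketch(text):
    """Step 2 (simplified): decode every %XX escape that maps to an unreserved character."""
    parts = text.split("%")
    out = [parts[0]]
    for part in parts[1:]:
        hex_pair = part[:2]
        if len(hex_pair) == 2 and all(c in string.hexdigits for c in hex_pair):
            char = chr(int(hex_pair, 16))
            if char in UNRESERVED:
                # The escape is replaced by the literal character.
                out.append(char + part[2:])
                continue
        # Not a decodable escape: keep the '%' and the text as-is.
        out.append("%" + part)
    return "".join(out)


def round_trip(host):
    """Apply both steps in the order described in the report."""
    return unquote_unreserved_sketch(collapse_zone_escape(host))


print(round_trip("[fe80::1:2:3:4%2561]"))    # [fe80::1:2:3:4a]
print(round_trip("[fe80::1:2:3:4%25eth0]"))  # [fe80::1:2:3:4%eth0]
```

A zone such as eth0 survives because %et is not a valid escape, whereas the zone 61 is lost because %61 decodes to the unreserved character a.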
Expected Result
requests.Request('GET', 'http://[fe80::1:2:3:4%61]').prepare().url -> Undefined/don't care (not RFC-compliant)
requests.Request('GET', 'http://[fe80::1:2:3:4%2561]').prepare().url -> 'http://[fe80::1:2:3:4%61]'
requests.Request('GET', 'http://[fe80::1:2:3:4%252561]').prepare().url -> 'http://[fe80::1:2:3:4%2561]'
Actual Result
requests.Request('GET', 'http://[fe80::1:2:3:4%61]').prepare().url -> 'http://[fe80::1:2:3:4a]'
requests.Request('GET', 'http://[fe80::1:2:3:4%2561]').prepare().url -> 'http://[fe80::1:2:3:4a]'
requests.Request('GET', 'http://[fe80::1:2:3:4%252561]').prepare().url -> 'http://[fe80::1:2:3:4%61]'
System Information
$ python -m requests.help
{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "2.1.0"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.3"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.10.5"
  },
  "platform": {
    "release": "22.1.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.28.1"
  },
  "system_ssl": {
    "version": "1010111f"
  },
  "urllib3": {
    "version": "1.26.12"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}