IPv6 Zone Identifiers are not correctly parsed (still) #6282

@benizl

Description

Issue #5775 notes that IPv6 Zone Identifiers are not parsed correctly when the Zone ID is itself a valid percent-encoded character from the UNRESERVED_SET. That issue was incorrectly closed as a duplicate of #5126; in fact, #5126 concerns a different, already resolved bug in urllib. This ticket is a duplicate of #5775, but commenting on that ticket is now locked.

IPv6 addresses can have the form fe80::1:2:3:4%zone, where zone is any alphanumeric sequence and is platform-dependent. To allow their use in URLs, where the %zone could be interpreted as a percent-escaped character, RFC 6874 requires the % to be replaced with its own percent-escaped representation %25, e.g. http://[fe80::1:2:3:4%25zone].
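To make the RFC 6874 rule concrete, here is a minimal sketch of the escaping step. The helper name `to_rfc6874_host` is hypothetical (it is not part of requests or urllib3) and it assumes the input is a bare address with at most one `%zone` suffix.

```python
def to_rfc6874_host(address: str) -> str:
    """Wrap an IPv6 address in brackets, escaping the zone delimiter as %25.

    Hypothetical helper for illustration only; it does not validate the address.
    """
    # Split on the first '%', which separates the address from the zone ID.
    addr, sep, zone = address.partition("%")
    if sep:
        return f"[{addr}%25{zone}]"
    return f"[{addr}]"


url = "http://" + to_rfc6874_host("fe80::1:2:3:4%eth0")
```

With the zone `eth0` (an assumed example name), this yields `http://[fe80::1:2:3:4%25eth0]`.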

In requests, this is not enough to protect the Zone ID in the URL if the zone is a valid percent-escaped character from the UNRESERVED_SET. Specifically, url.py::_normalize_host removes the RFC 6874 sequence and replaces it with a simple %; the round trip through quote/unquote_unreserved in utils.py::requote_uri, called from PreparedRequest::prepare_url, then transforms the Zone ID into the percent-escaped character anyway.
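The two-step interaction described above can be imitated with a simplified sketch. Both functions below are hypothetical stand-ins written for this illustration, not the actual code in urllib3 or requests; they only assume that one step collapses `%25` to `%` and a later step decodes percent-escapes of unreserved characters.

```python
import re
import string

# Unreserved characters per RFC 3986: letters, digits, and "-._~".
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def collapse_zone_escape(host: str) -> str:
    """Stand-in for host normalisation: turn the first '%25' into a bare '%'."""
    return host.replace("%25", "%", 1)


def unquote_unreserved_sketch(text: str) -> str:
    """Stand-in for unquoting: decode %XX only when it encodes an unreserved char."""
    def decode(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)

    return re.sub(r"%([0-9A-Fa-f]{2})", decode, text)


host = "[fe80::1:2:3:4%2561]"
after_normalise = collapse_zone_escape(host)
after_unquote = unquote_unreserved_sketch(after_normalise)
```

Here `after_normalise` is `[fe80::1:2:3:4%61]` and `after_unquote` is `[fe80::1:2:3:4a]`, which matches the second case reported under Actual Result. The sketch does not model the full pipeline, so it should not be expected to reproduce the other cases.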

Note that doubly escaping the percent works, but the URL is then neither intuitive nor RFC-compliant.

Expected Result

requests.Request('GET', 'http://[fe80::1:2:3:4%61]').prepare().url -> Undefined/don't care (not RFC-compliant)
requests.Request('GET', 'http://[fe80::1:2:3:4%2561]').prepare().url -> 'http://[fe80::1:2:3:4%61]'
requests.Request('GET', 'http://[fe80::1:2:3:4%252561]').prepare().url -> 'http://[fe80::1:2:3:4%2561]'

Actual Result

requests.Request('GET', 'http://[fe80::1:2:3:4%61]').prepare().url -> 'http://[fe80::1:2:3:4a]'
requests.Request('GET', 'http://[fe80::1:2:3:4%2561]').prepare().url -> 'http://[fe80::1:2:3:4a]'
requests.Request('GET', 'http://[fe80::1:2:3:4%252561]').prepare().url -> 'http://[fe80::1:2:3:4%61]'

System Information

$ python -m requests.help
{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "2.1.0"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.3"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.10.5"
  },
  "platform": {
    "release": "22.1.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.28.1"
  },
  "system_ssl": {
    "version": "1010111f"
  },
  "urllib3": {
    "version": "1.26.12"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}
