Allow horizontal tabs \t in response header values#2345
Allow horizontal tabs \t in response header values#2345jhominal merged 3 commits intoKludex:masterfrom
\t in response header values#2345Conversation
|
The main question I have is whether I should be updating the CHANGELOG.md now or if that will be on the person who will be preparing the release to do that. |
I'll do it. |
| @@ -26,7 +26,7 @@ | |||
| from uvicorn.server import ServerState | |||
|
|
|||
| HEADER_RE = re.compile(b'[\x00-\x1f\x7f()<>@,;:[]={} \t\\"]') | |||
There was a problem hiding this comment.
Should the name allow it as well, or is this good?
There was a problem hiding this comment.
There was a problem hiding this comment.
I suspect that, in the case of header names, the above regular expression does not match anything that is actually valid…
I will take another look at the new HTTP RFCs
There was a problem hiding this comment.
Sorry for the late comment.
In short:
RFC 9110 has the following relevant bits:
field-name = token
And in Section 5.6.1 Tokens:
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
; any VCHAR, except delimiters
- A check of the
tcharwhitelist from RFC 9110 shows that it does not overlap at all with the blacklist of theHEADER_REregular expression. Thus, there is no reason for there to be an issue where someone's valid header name would be blocked by this check; - Only 2 ASCII characters are not covered by either the whitelist from RFC 9110, or the
HEADER_REblacklist:/and?. I can add them to theHEADER_REregex if you would like to; - In principle, any byte above 0x80 would be forbidden by the standard too. Should they also be added to
HEADER_RE?
There was a problem hiding this comment.
In other words, I think that the HEADER_RE regular expression for header names is fine, in that it does not block any standard-valid input.
|
Another question @Kludex @tomchristie : would you be opposed to adding, in the exception messages, a repr of the invalid value? (note: h11 does that) |
| async def test_response_header_splitting(http_protocol_cls: HTTPProtocol): | ||
| app = Response(b"", headers={"key": "value\r\nCookie: smuggled=value"}) | ||
|
|
||
| protocol = get_connected_protocol(app, http_protocol_cls) | ||
| protocol.data_received(SIMPLE_GET_REQUEST) | ||
| await protocol.loop.run_one() | ||
| assert b"HTTP/1.1 500 Internal Server Error" not in protocol.transport.buffer | ||
| assert b"\r\nCookie: smuggled=value\r\n" not in protocol.transport.buffer | ||
| assert protocol.transport.is_closing() |
There was a problem hiding this comment.
What's this test about? This was already passing before.
There was a problem hiding this comment.
While working on this PR, I did not find a test that checked that invalid response header values were blocked by uvicorn (preventing, among other issues, header response splitting).
Thus, this test fulfills two purposes for me:
- It blocks me from implementing "allow horizontal tabs" by removing the regular expression check, either wilfully or inadvertently;
- It confirms that there is no need to go and change the
h11implementation for header splitting issues;
In other words, I believed that there was no other test that was covering that bit of functionality, so I decided to add one.
|
If you want, we can make a release right away. Please create a bump PR, if you want. Also, feel free to merge this. |
Summary
This PR allows for
\tto be returned in response header values by uvicorn whenhttptoolsare in use.As explained in #2344, both the letter of the current spec (RFC 9210), and the
h11implementation, allow tab characters to be present in HTTP header values.Checklist