Conversation
CodSpeed Performance ReportMerging #12413 will degrade performances by 2.87%Comparing Summary
Benchmarks breakdown
|
glyph
left a comment
There was a problem hiding this comment.
Request's biggest problem is a sprawling public interface with waaay too many attributes, and I would really prefer that we avoid adding even more of those; particularly not mutable public attributes plus class vars which may or may not be honored in specific parts of the lifecycle of the request and may therefore represent inaccurate values depending on when they are consulted.
There is also an inherent problem here where only the code that configures the request at the top of the request hierarchy specifies the parseBody configuration, and resources lower down in the tree may or may not be able to function correctly with an empty args during a post request, creating a sort of silent-failure type error.
That said… doing something here without designing a completely new fancy interface that solves every problem at once will at least get us un-stuck.
The @type annotation on twisted.web.http.Request says that args is unconditionally a dict, but, technically that has always been a lie; args starts as None and only takes on a non-None value at a specific point in the lifecycle.
So, we can weasel our way into a change which:
- is technically compatible
- looks a lot like what you've written here
- doesn't blow up in scope
- doesn't create silent failure conditions for resources that handle POSTs
by:
- accepting
parseBodyas a constructor argument and only a constructor argument; storing the state privately - adding a for-real type annotation to
argsto note that it may be (and may always have been)None - documenting when
argsmight beNone - making the official check for "you didn't parse any arguments" to be
request.args is None
This does still leave an edge case open, for code that looks like this:
def render_POST(self, request):
if not request.args:
deleteEverything()
else:
doUpdates(request.args)
I think that is probably fine since it was technically already broken in some weird edge cases and you will not opt yourself into this behavior by accident; somebody has to pass the new argument explicitly to get the new behavior. The other option here is to make it simply incompatible and delete the args attribute entirely in this situation, which I would also be fine with but may be a bit too belt-and-suspenders a change.
|
I will get back to this next week, but... pretty sure |
|
And I don't think "breaking child Resources" is always a problem. It's a big switch, but the main use case would be applications that control all their Resources (Tahoe-LAFS, Synapse). |
I think, on balance, that this is a worthwhile tradeoff to start getting some forward motion on this perpetual morass. But I also really don't like that these applications that "control" all their own resources will have no warning if something is broken. The type checker can't tell you, you don't get an error at runtime, you just find that parameters are mysteriously missing when they were previously requested. The ideal solution here would be to have a marker interface that would let you indicate that you take something other than |
|
I am also +1 for doing something here. Maybe we can have this implementes using a private attribute that is set by the request fatory. Ultimately, I think that the ideal API is to allow setting these flags via your own We currently have I already have something like this implemented in my fork of twisted.web. What I am doing is:
|
adiroiban
left a comment
There was a problem hiding this comment.
Changes looks good.
I think that this is a good step toward improving the Twisted HTTP server.
I think that is best to keep the Request._parseBody API private and only expose the API via the Site.
Thanks again!
|
OK, this is basically the same design (all or nothing flag, no indication that it has been set unfortunately) but with no public attributes, just additional constructor parameters. |
|
please review |
I think a follow-up PR for a read-only attribute, or at least some way to determine if it's been set (given that we can't make |
glyph
left a comment
There was a problem hiding this comment.
At the last possible second I had the thought that perhaps a better way to address this more flexibly would be to have a bodyParsers argument that actually contained a callable that could do something more pluggable here, and we could have a LEGACY_MULTIPART_PARSER object that did this?
But, in that spirit, and taking a bunch less work, maybe rename parseBody to parseFormSubmitPOSTs or something to highlight its highly-specific semantics?
In any case, I am OK with this as is; please add something to the narrative docs that explains this, but it can be a single paragraph, and merge at your discretion.
|
I don't think pluggable callables adds much, Tahoe-LAFS just overrides a method, can continue doing that. And it doesn't really enable an async API in any meaningful way. |
Scope and purpose
Fixes #12412
Gives users the ability to parse HTTP body uploads themselves without having to pay the costs of twisted.web's parsing of bodies.