Clarify streaming response body behavior in SPEC#1748
Clarify streaming response body behavior in SPEC#1748tenderlove merged 3 commits intorack:masterfrom
Conversation
Strings must be processed individually as they are yielded by `each`. However, if the Body responds to `to_ary` it can be implicitly coerced to an Array, which may then be processed all at once.
Apply to `ContentLength` and `ETag` middleware.
|
Note: for Rails compatibility, this change will require a corresponding change in Rails's |
jeremyevans
left a comment
There was a problem hiding this comment.
In terms of the idea, I think it is reasonable. It's conservative in the sense that if the body is already an array, its size is known and fixed. This does fix cases where middleware that ship with Rack will buffer when buffering is not be desireable. The downside is the breakage of backwards compatibility in cases where the current behavior of the middleware handling non-Array responses is desired. However, for a major version change, I guess that is acceptable.
In terms of the implementation, I think the wording should be more specific. The phrases must be processed individually as they are yielded and may be processed all at once are too vague IMO. I'm guessing the first phrase means by the web server and the second phrase means by middleware, but that's not obvious. Maybe something like:
Middleware must not call +each+ directly on the Body. Middleware can
return a new Body which has an +each+ method that calls +each+ on the
returned Body.
If the Body responds to +to_ary+, middleware can call +to_ary+ on the
Body, can call +each+ on the resulting array, and can return the resulting
array as the new Body. In other words, middleware can only process the Body
before returning if the body responds to +to_ary+.
Second, SPEC.rdoc should not be modified directly. Instead lib/rack/lint.rb should be modified to add appropriately formatted comments, and rake spec should be run to regenerate SPEC.rdoc.
Finally, the class-level comments for the ContentLength and ETag middleware should be modified to explicitly state that they only set the headers if the body responds to to_ary. I suppose this isn't necessary if that's stated in SPEC.rdoc, but I think being more specific would help.
|
I'd consider downgrading to a "should" -- if someone wants to create a CacheAllBodiesEvenBigOnes middleware, to me that should be "consciously deviating from the recommended path" not "breaking the letter of the spec". I also don't think we should actually call the detected I noodled with a few possible phrasings, but couldn't come up with a compelling specific suggestion that conveys "[if you replace the body, and you are consuming the original] you should yield approximately as often [or at least not less often] than the body you're wrapping; if that wrapped body responds to I think we might need a BodyProxy subclass that defines |
Added the following Lint errors: - "Middleware must not call #each directly" - "New body must yield at least once per iteration of old body" - "Body has not been closed" - "#to_ary not identical to contents produced by calling #each"
|
Thanks all for the comments and feedback so far! @jeremyevans thanks for suggesting more specific wording (and for reminding me that the spec was generated from the lint comments!).
@matthewd thanks for your detailed comments as well.
I did another pass on the proposed spec changes while trying to consider all the comments raised. It's quite a bit more verbose, but less ambiguous: Note also that the final paragraph replaces the existing paragraph on |
|
This looks great to me, the clarity is excellent. I really like the direction here. |
|
I'm mostly OK with the revised language, except that I think we should change: to: We shouldn't describe how Body objects must work internally, only the interface that middleware should use to interact with them. I think we should also change: to: The reason for this change is to make it clear that you cannot operate on the Body itself just because the body responds to if body.respond_to?(:to_ary) && body.respond_to?(:[])
body[0] = 'foo'
endYou have to do instead: if body.respond_to?(:to_ary)
body = body.to_ary
body[0] = 'foo'
endAdditionally, you already mentioned that Finally, we should probably use a separate heading for this new section such as |
|
@jeremyevans thanks again for your comments and for taking a close look at the proposed language. I think we're getting very close! Agree with leaving out the line about Also agree with saying that middleware "must use the result of It might be hard to forbid code like
I actually expect servers to use For now, here's another draft with the changes mentioned: |
Can confirm. |
|
@wjordan I love the work you are doing here. It is clear to me that you (and everyone else) is drawing on a huge level of knowledge and the result is a clearer, more precise and easier to implement SPEC. Thanks so much and I look forward to when we can merge this! |
|
I don't see this in the changelog. In which version is this included? |
@swrobel This wasn't released yet and you can find it only in |
Thanks for clarifying @simi! I do see there's a changelog section for 3.0 unreleased items, so it seems it would be helpful to have there. |
|
@swrobel This is already present in CHANGELOG: |
This PR stems from the discussion in #1745 concerning how to distinguish between streaming and buffered bodies in the Rack interface. The proposal here adds the following two sentences to the specification:
I consider this more of a 'clarification' than a breaking change to the interface, as previous discussions and implementations in the Rack project have suggested that treating
Arraybodies (or objects coercible toArrayvia#to_ary) as fixed-length content, in contrast to any other response bodies which may be possibly-streaming, indeterminate content, had been implicit in the interface since at least Dec 2008 (7a0476b).For example, see #166 (comment) from May 2011:
This PR also updates
ContentLengthandETagmiddleware to demonstrate how this spec clarification can guide a more consistent implementation of 'buffering' middleware/servers in practice. There are a few steps:body.respond_to?(:to_ary), and skips the buffering/transformation logic if the response is not coercible toArraybody = body.to_ary, which coerces the response body to ensure that it is anArrayofStrings.Some notes:
BodyProxyis no longer needed in middleware using this pattern, becauseArrayobjects never respond toclose. (If#closeis required by an object that can be coerced to anArray, it should close as part of itsto_aryimplementation. For example, see the update to the 'close bodies that need to be closed' content length spec test.)#write), to reduce the number of socket writes.respond_to?(:ary)inContentLength. Previously, theto_arycoercion was never actually performed on the body before eager-loading, it was assumed the content was already buffered as long as therespond_to?was satisfied. However, this allowed for overridingto_aryto return anything, evennil(see e.g., the workaround in Add support for Rack::ContentLength middleware rails/rails#16793), just to trigger the eager-loading behavior without actually producing the requiredArrayofStrings representing fully-buffered content. In this proposal (and implementation), the coercion is actually performed which prevents this kind of confusion.Rack::Lintare included to help enforce this new clarification.