-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
bpo-31945: Configurable blocksize in HTTP(S)Connection #4279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
blocksize was hardcoded to 8192, preventing efficient upload when using file-like body. Add blocksize argument to __init__, so users can configure the blocksize to fit their needs. I tested this uploading data from /dev/zero to a web server dropping the received data, to test the overhead of the HTTPConnection.send() with a file-like object. Here is an example 10g upload with the default buffer size (8192): $ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/ Uploaded 10.00g in 17.53 seconds (584.00m/s) real 0m17.574s user 0m8.887s sys 0m5.971s Same with 512k blocksize: $ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/ Uploaded 10.00g in 6.60 seconds (1551.15m/s) real 0m6.641s user 0m3.426s sys 0m2.162s In real world usage the difference will be smaller, depending on the local and remote storage and the network. See https://github.com/nirs/http-bench for more info.
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to write a test for it, maybe using a mock socket?
Doc/library/http.client.rst
Outdated
| (if it is not given, the global default timeout setting is used). | ||
| The optional *source_address* parameter may be a tuple of a (host, port) | ||
| to use as the source address the HTTP connection is made from. | ||
| The optional *blocksize* parameter sets the buffer size used to send a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add "in bytes" to size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reivew, will update in the next version.
I'll check how to add a test for this.
| source_address=None, blocksize=8192): | ||
| self.timeout = timeout | ||
| self.source_address = source_address | ||
| self.blocksize = blocksize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest either name the attribute with an underscore (“_blocksize”) or document it as a public attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll document that this is a public attribute, since a user of the connection may like to modify it.
For example, user code may want to check if a connection supports changing the block size, and hasattr(conn, "blocksize") is nicer to use compared with checking __init__ signature.
|
For testing, just check the parameter passed to the read method, and use whatever socket or server the other tests use. |
- Note that buffer size is "in bytes" (Victor) - Document the new public "blocksize" attribute (Martin)
Doc/library/http.client.rst
Outdated
| called. | ||
|
|
||
|
|
||
| .. attribute:: HTTPConnection.blocksize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in a list of “four functions” for low-level requests. Perhaps move it up into the main list, and adjust or remove “the following methods”.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will move up before the low level method section.
- Move blocksize attribute documentaion before the low level methods section (Martin)
Test that the send loop uses the configured block size when sending a readable object.
We have two different code paths using blocksize - send() and _read_readable(). Add a test for the second code path.
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you for the added unit tests.
You might add an entry to Doc/whatsnew/3.7.rst
| @@ -0,0 +1,2 @@ | |||
| Add Configurable blocksize to ``HTTPConnection`` and ``HTTPSConnection`` | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose Add configurable *blocksize* parameter to (...).
|
Basically, copy/paste your NEWS entry in the "Improved Modules" section: it seems like you have to add a new "socket" sub-section. |
blocksize was hardcoded to 8192, preventing efficient upload when using file-like body. Add blocksize argument to __init__, so users can configure the blocksize to fit their needs. I tested this uploading data from /dev/zero to a web server dropping the received data, to test the overhead of the HTTPConnection.send() with a file-like object. Here is an example 10g upload with the default buffer size (8192): $ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/ Uploaded 10.00g in 17.53 seconds (584.00m/s) real 0m17.574s user 0m8.887s sys 0m5.971s Same with 512k blocksize: $ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/ Uploaded 10.00g in 6.60 seconds (1551.15m/s) real 0m6.641s user 0m3.426s sys 0m2.162s In real world usage the difference will be smaller, depending on the local and remote storage and the network. See https://github.com/nirs/http-bench for more info.
blocksize was hardcoded to 8192, preventing efficient upload when using
file-like body. Add blocksize argument to init, so users can
configure the blocksize to fit their needs.
I tested this uploading data from /dev/zero to a web server dropping the
received data, to test the overhead of the HTTPConnection.send() with a
file-like object.
Here is an example 10g upload with the default buffer size (8192):
$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 17.53 seconds (584.00m/s)
real 0m17.574s
user 0m8.887s
sys 0m5.971s
Same with 512k blocksize:
$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 6.60 seconds (1551.15m/s)
real 0m6.641s
user 0m3.426s
sys 0m2.162s
In real world usage the difference will be smaller, depending on the
local and remote storage and the network.
See https://github.com/nirs/http-bench for more info.
https://bugs.python.org/issue31945