Skip to content

Conversation

@nirs
Copy link
Contributor

@nirs nirs commented Nov 4, 2017

blocksize was hardcoded to 8192, preventing efficient upload when using
file-like body. Add blocksize argument to init, so users can
configure the blocksize to fit their needs.

I tested this uploading data from /dev/zero to a web server dropping the
received data, to test the overhead of the HTTPConnection.send() with a
file-like object.

Here is an example 10g upload with the default buffer size (8192):

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 17.53 seconds (584.00m/s)

real 0m17.574s
user 0m8.887s
sys 0m5.971s

Same with 512k blocksize:

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 6.60 seconds (1551.15m/s)

real 0m6.641s
user 0m3.426s
sys 0m2.162s

In real world usage the difference will be smaller, depending on the
local and remote storage and the network.

See https://github.com/nirs/http-bench for more info.

https://bugs.python.org/issue31945

blocksize was hardcoded to 8192, preventing efficient upload when using
file-like body. Add blocksize argument to __init__, so users can
configure the blocksize to fit their needs.

I tested this uploading data from /dev/zero to a web server dropping the
received data, to test the overhead of the HTTPConnection.send() with a
file-like object.

Here is an example 10g upload with the default buffer size (8192):

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 17.53 seconds (584.00m/s)

real	0m17.574s
user	0m8.887s
sys	0m5.971s

Same with 512k blocksize:

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 6.60 seconds (1551.15m/s)

real	0m6.641s
user	0m3.426s
sys	0m2.162s

In real world usage the difference will be smaller, depending on the
local and remote storage and the network.

See https://github.com/nirs/http-bench for more info.
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to write a test for it, maybe using a mock socket?

(if it is not given, the global default timeout setting is used).
The optional *source_address* parameter may be a tuple of a (host, port)
to use as the source address the HTTP connection is made from.
The optional *blocksize* parameter sets the buffer size used to send a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add "in bytes" to size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reivew, will update in the next version.

I'll check how to add a test for this.

source_address=None, blocksize=8192):
self.timeout = timeout
self.source_address = source_address
self.blocksize = blocksize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest either name the attribute with an underscore (“_blocksize”) or document it as a public attribute.

Copy link
Contributor Author

@nirs nirs Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll document that this is a public attribute, since a user of the connection may like to modify it.

For example, user code may want to check if a connection supports changing the block size, and hasattr(conn, "blocksize") is nicer to use compared with checking __init__ signature.

@vadmium
Copy link
Member

vadmium commented Nov 5, 2017

For testing, just check the parameter passed to the read method, and use whatever socket or server the other tests use.

- Note that buffer size is "in bytes" (Victor)
- Document the new public "blocksize" attribute (Martin)
called.


.. attribute:: HTTPConnection.blocksize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in a list of “four functions” for low-level requests. Perhaps move it up into the main list, and adjust or remove “the following methods”.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will move up before the low level method section.

nirs added 4 commits November 6, 2017 10:25
- Move blocksize attribute documentaion before the low level methods
  section (Martin)
Test that the send loop uses the configured block size when sending a
readable object.
We have two different code paths using blocksize - send() and
_read_readable(). Add a test for the second code path.
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for the added unit tests.

You might add an entry to Doc/whatsnew/3.7.rst

@@ -0,0 +1,2 @@
Add Configurable blocksize to ``HTTPConnection`` and ``HTTPSConnection``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose Add configurable *blocksize* parameter to (...).

@nirs
Copy link
Contributor Author

nirs commented Nov 6, 2017

You might add an entry to Doc/whatsnew/3.7.rst
@Haypo What kind of info should be in whatsnew?

@vstinner
Copy link
Member

vstinner commented Nov 6, 2017

@Haypo What kind of info should be in whatsnew?

Basically, copy/paste your NEWS entry in the "Improved Modules" section: it seems like you have to add a new "socket" sub-section.

@vstinner vstinner merged commit ad455cd into python:master Nov 6, 2017
@nirs nirs deleted the httpcon-blocksize branch November 6, 2017 21:52
embray pushed a commit to embray/cpython that referenced this pull request Nov 9, 2017
blocksize was hardcoded to 8192, preventing efficient upload when using
file-like body. Add blocksize argument to __init__, so users can
configure the blocksize to fit their needs.

I tested this uploading data from /dev/zero to a web server dropping the
received data, to test the overhead of the HTTPConnection.send() with a
file-like object.

Here is an example 10g upload with the default buffer size (8192):

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 17.53 seconds (584.00m/s)

real	0m17.574s
user	0m8.887s
sys	0m5.971s

Same with 512k blocksize:

$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 6.60 seconds (1551.15m/s)

real	0m6.641s
user	0m3.426s
sys	0m2.162s

In real world usage the difference will be smaller, depending on the
local and remote storage and the network.

See https://github.com/nirs/http-bench for more info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants