gh-140135: use PyBytesWriter in io.RawIOBase.readall; 3.98x faster #140139
vstinner merged 6 commits into python:main from
vstinner left a comment:
Please add a NEWS entry, IMO this speedup is significant enough to be documented!
Impressive speedup!
The CI failed. I'm not exactly sure why. The same jobs succeeded before:
No meaningful changes since the last run (b8a7f89 is just style, and f0ae824 just adds a NEWS entry). From what I see, some flakiness was reported: I'm rerunning by merging the
I think this looks good. The only other thing is that I would verify this doesn't regress performance for sizes <1MB (e.g. 1K and 4K maybe). I expect the performance differences to be much smaller there, if present at all, but that's OK; we just want to make sure there is no regression.
Great point. The benchmark (the same, just tweaked `SIZES_KB`):

    import io

    import pyperf

    CHUNK_SIZE = 1024
    SIZES_KB = [1, 4, 8]


    class ChunkedRaw(io.RawIOBase):
        def __init__(self, data, chunk_size):
            self._buf = memoryview(data)
            self._pos = 0
            self._chunk_size = chunk_size

        def readable(self):
            return True

        def read(self, n: int = -1):
            if self._pos >= len(self._buf):
                return b""
            to_read = (
                self._chunk_size
                if (n is None or n < 0)
                else min(n, self._chunk_size)
            )
            end = min(self._pos + to_read, len(self._buf))
            out = bytes(self._buf[self._pos : end])
            self._pos = end
            return out


    def generate_bytes(total):
        block = b"abcdefghijklmnopqrstuvwxyz0123456789" * 128
        return (block * (total // len(block) + 1))[:total]


    def _bench_readall(data, chunk_size):
        r = ChunkedRaw(data, chunk_size)
        out = r.readall()
        if len(out) != len(data):
            raise RuntimeError("what is going on...???")


    def main():
        runner = pyperf.Runner()
        for size_kib in SIZES_KB:
            total_bytes = size_kib * 1024
            data = generate_bytes(total_bytes)
            name = f"rawiobase_readall_{size_kib}KB_chunk{CHUNK_SIZE}"
            runner.bench_func(name, _bench_readall, data, CHUNK_SIZE)


    if __name__ == "__main__":
        main()

The result:
The issue #140135 provides more details.
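For readers who want a feel for what `readall` does before reading the issue, here is a rough pure-Python sketch of the read-until-EOF loop, loosely modeled on the pure-Python fallback in `_pyio`. It is not the C code touched by this PR and does not show the `PyBytesWriter` API; the names `readall_sketch` and `TinyChunks` are hypothetical and exist only for illustration.

```python
import io


def readall_sketch(raw, bufsize=io.DEFAULT_BUFFER_SIZE):
    """Read from `raw` until EOF and return everything as one bytes object."""
    buf = bytearray()  # growable buffer that accumulates the chunks
    while True:
        chunk = raw.read(bufsize)
        if not chunk:  # b"" signals EOF; None means no data was available
            break
        buf += chunk
    # If nothing was read, pass through the last read() result (b"" or None).
    return bytes(buf) if buf else chunk


class TinyChunks(io.RawIOBase):
    """Hypothetical raw stream that returns at most `chunk_size` bytes per read()."""

    def __init__(self, data, chunk_size=1024):
        self._data = data
        self._pos = 0
        self._chunk_size = chunk_size

    def readable(self):
        return True

    def read(self, n=-1):
        size = self._chunk_size if n is None or n < 0 else min(n, self._chunk_size)
        out = self._data[self._pos : self._pos + size]
        self._pos += len(out)
        return out


data = bytes(range(256)) * 20  # 5120 bytes, delivered in five 1 KiB chunks
assert readall_sketch(TinyChunks(data)) == data
assert TinyChunks(data).readall() == data  # the built-in readall agrees
```

The sketch only illustrates the contract (many small reads in, one `bytes` object out); how the chunks are accumulated is the part the PR changes on the C side.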
Benchmark
The script:
The results (with `--rigorous`):
The environment:
`sudo ./python -m pyperf system tune` ensured.
PyBytesWriter API in io.RawIOBase.readall #140135