[s3] Optimize forward seeks within buffered data to avoid redundant GET #892
Merged
ddelange added a commit that referenced this pull request on Oct 11, 2025:
…o chunked_s3 * 'develop' of https://github.com/piskvorky/smart_open: Optimize forward seeks within buffered data to avoid redundant GET (#892) Add macos to CI (#891)
ddelange added a commit that referenced this pull request on Oct 12, 2025:
…o fix-ssh * 'develop' of https://github.com/piskvorky/smart_open: (66 commits) Optimize forward seeks within buffered data to avoid redundant GET (#892) Add macos to CI (#891) Simplify CI, use uv (#890) [s3] Improve handling of InvalidRange and seek on empty file (#889) Protect against hanging tests (#888) Bump the github-actions group with 2 updates (#886) build: fix invalid `fallback_version` when builing with `uv` (#884) Remove travis leftover (#881) Disambiguate URI examples in README.rst (#879) Update CHANGELOG.md Add .xz and increase performance of compression module (#875) Bump pypa/gh-action-pypi-publish in /.github/workflows (#878) Bump actions/checkout from 4 to 5 in the github-actions group (#877) Fix release.sh for the final merge back into develop (#872) Update CHANGELOG.md Drop 3.7 support in pyproject.toml (#871) Fix CI badge (#869) Bump softprops/action-gh-release in the github-actions group (#867) Fix release.sh merge message and final merge (#868) Update CHANGELOG.md ...
ddelange added a commit that referenced this pull request on Oct 20, 2025:
* develop: Update CHANGELOG.md Use compression.zstd (PEP-784) (#895) Drop python 3.8, add python 3.14 (#896) [s3] Add range_chunk_size param to read using multiple GET requests (#887) Run tests in parallel (#893) Optimize forward seeks within buffered data to avoid redundant GET (#892) Add macos to CI (#891) Simplify CI, use uv (#890) [s3] Improve handling of InvalidRange and seek on empty file (#889) Protect against hanging tests (#888) Bump the github-actions group with 2 updates (#886) build: fix invalid `fallback_version` when builing with `uv` (#884) Remove travis leftover (#881) Disambiguate URI examples in README.rst (#879)
Released v7.4.0
Motivation
Ref #712, fix #622, fix #742
Before this change, any seek operation would close the current HTTP connection and open a new GET request to S3, even when seeking forward by a small amount within already-buffered data.
This commit adds an optimization to Reader.seek() that checks if a forward seek can be satisfied from the existing buffer. If the target position falls within the currently buffered range, the implementation simply advances the buffer position without making a new S3 request.
Backward seeks and forward seeks beyond the buffer still require new GET requests as before.
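To illustrate the idea, here is a minimal, self-contained sketch. It is not the actual smart_open code: `CountingSource`, `SketchReader`, `open_range`, and `chunk_size` are hypothetical names invented for this example, and an in-memory byte string stands in for S3 so that simulated GET requests can be counted.

```python
import io


class CountingSource:
    """Hypothetical stand-in for a remote object; counts simulated ranged GETs."""

    def __init__(self, data: bytes):
        self.data = data
        self.get_count = 0

    def open_range(self, start: int) -> io.BytesIO:
        # Each call models one "GET with Range: bytes=start-" request.
        self.get_count += 1
        return io.BytesIO(self.data[start:])


class SketchReader:
    """Illustrative reader that serves forward seeks from its buffer when possible."""

    def __init__(self, source: CountingSource, chunk_size: int = 8):
        self._source = source
        self._chunk_size = chunk_size
        self._stream = None   # opened lazily on first read
        self._buffer = b""    # bytes already fetched but not yet consumed
        self._position = 0    # logical position in the object

    def _fill(self, size: int) -> None:
        if self._stream is None:
            # The stream must start right after whatever is already buffered.
            self._stream = self._source.open_range(self._position + len(self._buffer))
        while len(self._buffer) < size:
            chunk = self._stream.read(self._chunk_size)
            if not chunk:
                break
            self._buffer += chunk

    def read(self, size: int) -> bytes:
        self._fill(size)
        out, self._buffer = self._buffer[:size], self._buffer[size:]
        self._position += len(out)
        return out

    def seek(self, target: int) -> int:
        offset = target - self._position
        if 0 <= offset <= len(self._buffer):
            # Forward seek inside buffered data: discard the skipped bytes
            # and keep the open stream, so no new request is needed.
            self._buffer = self._buffer[offset:]
        else:
            # Backward seek or seek past the buffer: drop the stream so the
            # next read opens a new ranged request at the target.
            self._stream = None
            self._buffer = b""
        self._position = target
        return self._position


if __name__ == "__main__":
    source = CountingSource(bytes(range(32)))
    reader = SketchReader(source, chunk_size=8)
    reader.read(2)           # first request; buffers bytes 2..7
    reader.seek(5)           # inside the buffer: no new request
    print(source.get_count)  # 1
    reader.seek(20)          # beyond the buffer: next read issues a new request
    reader.read(1)
    print(source.get_count)  # 2
```

In this sketch the buffer is a plain `bytes` object that is re-sliced on every read, which keeps the logic easy to follow; a real reader would likely use a more efficient buffer structure.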
Tests
Work in progress
Checklist
python update_helptext.py in case there are API changes
Workflow