Fix RemoteArchive performance by adding bulk read methods #6377

Merged
pstreef merged 2 commits into main from fix/remote-archive-bulk-read
Dec 4, 2025

@pstreef pstreef commented Dec 4, 2025

Problem

RemoteArchive.getInputStream() creates InputStream wrappers that only override the single-byte read() method. When RemoteArtifactCache copies the downloaded archive to disk using Files.copy(), the copy asks for bulk reads, but the inherited InputStream.read(byte[], int, int) implementation serves each request by calling read() once per byte, so the underlying stream's bulk read is never used.

For large archives like the Gradle distribution (129MB), this results in ~135 million individual read calls instead of ~2000 bulk reads, causing a 665x slowdown.
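The effect can be reproduced in isolation. The following is a minimal sketch, not code from this PR: `SingleByteWrapperDemo`, `singleByteOnly` and `drain` are hypothetical names, and the 8 KiB buffer is an assumption chosen for illustration. It wraps an in-memory stream in an InputStream that only overrides read() and counts how often that method is hit while the stream is drained through a buffered loop.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

public class SingleByteWrapperDemo {

    /** Hypothetical wrapper that overrides only the single-byte read(). */
    static InputStream singleByteOnly(InputStream delegate, AtomicLong calls) {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                calls.incrementAndGet();
                return delegate.read();
            }
        };
    }

    /** Drains the stream through an 8 KiB buffer (assumed size) and returns the byte count. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // Each bulk request lands in InputStream's default read(byte[], int, int),
        // which loops over the single-byte read() above.
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int size = 1 << 20; // 1 MiB of zero bytes as a stand-in payload
        AtomicLong calls = new AtomicLong();
        try (InputStream in = singleByteOnly(new ByteArrayInputStream(new byte[size]), calls)) {
            long copied = drain(in);
            System.out.println(copied + " bytes drained with " + calls.get() + " read() calls");
        }
    }
}
```

Even though the caller only ever asks for bulk reads, the wrapper sees at least one read() call per byte, which is the pattern described above.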

Solution

Add bulk read method read(byte[], int, int) to both InputStream wrappers:

  1. The Content-Length validation wrapper during HTTP download
  2. The ZipInputStream wrapper in readIntoArchive()
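As an illustration of the kind of change listed above, here is a minimal sketch of a length-validating wrapper that forwards bulk reads to its delegate. It is not the code merged in this PR: `BulkReadWrapperDemo`, `LengthCheckingInputStream` and the exact validation rule (fail at end of stream if the byte count differs from the expected length) are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BulkReadWrapperDemo {

    /** Hypothetical wrapper: counts bytes and fails at EOF if the total differs from the expected length. */
    static final class LengthCheckingInputStream extends InputStream {
        private final InputStream delegate;
        private final long expectedLength;
        private long bytesRead;

        LengthCheckingInputStream(InputStream delegate, long expectedLength) {
            this.delegate = delegate;
            this.expectedLength = expectedLength;
        }

        @Override
        public int read() throws IOException {
            int b = delegate.read();
            if (b == -1) {
                verifyComplete();
            } else {
                bytesRead++;
            }
            return b;
        }

        // The bulk override: one delegate call per buffer instead of one per byte.
        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = delegate.read(buf, off, len);
            if (n == -1) {
                verifyComplete();
            } else {
                bytesRead += n;
            }
            return n;
        }

        private void verifyComplete() throws IOException {
            if (bytesRead != expectedLength) {
                throw new IOException("Expected " + expectedLength + " bytes but read " + bytesRead);
            }
        }

        @Override
        public void close() throws IOException {
            delegate.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[10_000];
        try (InputStream in = new LengthCheckingInputStream(new ByteArrayInputStream(payload), payload.length)) {
            System.out.println("read " + in.readAllBytes().length + " bytes");
        }
    }
}
```

Keeping the same bookkeeping in both read() and read(byte[], int, int) means the validation behaves identically whichever method the caller uses.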

Benchmark results

Tested with gradle-9.2.1-bin.zip (129MB):

  • Cache copy with bulk read: ~100ms
  • Cache copy without bulk read: ~66,600ms (665x slower)
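A comparison of this kind can be approximated locally with a small timing harness. The sketch below is not the benchmark used for the numbers above: `CopyTimingSketch`, `singleByteOnly`, `timedCopy` and the 16 MiB in-memory payload are all assumptions, and because the source is an in-memory array rather than an HTTP or zip stream, the measured ratio will likely differ from the figures reported here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyTimingSketch {

    /** Hypothetical wrapper exposing only the single-byte read() to the caller. */
    static InputStream singleByteOnly(InputStream delegate) {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return delegate.read();
            }
        };
    }

    /** Copies the stream to a temp file and returns {bytesCopied, elapsedNanos}. */
    static long[] timedCopy(InputStream in) throws IOException {
        Path target = Files.createTempFile("copy-timing", ".bin");
        try {
            long start = System.nanoTime();
            // createTempFile already created the file, so overwrite it explicitly.
            long bytes = Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return new long[] {bytes, System.nanoTime() - start};
        } finally {
            Files.deleteIfExists(target);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[16 * 1024 * 1024]; // assumed 16 MiB test payload
        long[] bulk = timedCopy(new ByteArrayInputStream(payload));
        long[] single = timedCopy(singleByteOnly(new ByteArrayInputStream(payload)));
        System.out.printf("bulk:        %d bytes in %d ms%n", bulk[0], bulk[1] / 1_000_000);
        System.out.printf("single-byte: %d bytes in %d ms%n", single[0], single[1] / 1_000_000);
    }
}
```

A single run like this is only indicative; for stable numbers, repeated runs with warm-up or a harness such as JMH would be more appropriate.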

@github-project-automation github-project-automation bot moved this from In Progress to Ready to Review in OpenRewrite Dec 4, 2025
@pstreef pstreef marked this pull request as ready for review December 4, 2025 14:06
@timtebeek timtebeek added the enhancement and performance labels Dec 4, 2025
InputStream wrappers only overrode read(), causing Files.copy() to
fall back to single-byte reads. For large archives (e.g. 129MB Gradle
distribution), this resulted in ~135 million read calls instead of
~2000 bulk reads - a 665x slowdown.
@pstreef pstreef force-pushed the fix/remote-archive-bulk-read branch from 5e84009 to d101c3b Compare December 4, 2025 14:38
@pstreef pstreef merged commit bbd37e2 into main Dec 4, 2025
1 of 2 checks passed
@pstreef pstreef deleted the fix/remote-archive-bulk-read branch December 4, 2025 15:19
@github-project-automation github-project-automation bot moved this from Ready to Review to Done in OpenRewrite Dec 4, 2025