perf: skip BufReader for Stored files in make_reader by lovasoa · Pull Request #739 · zip-rs/zip2

lovasoa · 2026-03-20T14:12:23Z

Currently, by_index() allocates an 8 KiB BufReader and a Decompressor for every entry, even when CompressionMethod is Stored (when decompression is a no-op pass-through). For a 23K-entry JAR (mostly Stored files), this means 2 heap allocations per call totaling 8,632 bytes each — 47,262 allocations just to iterate and drop.

Fix: add a ZipFileReader::Stored variant that wraps CryptoReader directly in Crc32Reader, bypassing both BufReader and Decompressor.

This required adding a Debug impl for CryptoReader since the new Stored variant needs it for the derive(Debug) on ZipFileReader.

Measured on bazel.jar (23,631 Stored entries):

  by_index() allocations:
    before: 2.0 allocs/entry, 8,632 bytes alloc'd each
    after:  1.0 allocs/entry,   256 bytes alloc'd each
  cargo bench read_entry (1 MiB Stored file):
    before: 106,657 ns/iter (9,831 MB/s)
    after:  101,276 ns/iter (10,353 MB/s, +5.3%)

Problem: by_index() allocates an 8 KiB BufReader and a Decompressor for every entry, even CompressionMethod::Stored where decompression is a no-op pass-through. For a 23K-entry JAR (mostly Stored files), this means 2 heap allocations per call totaling 8,632 bytes each — 47,262 allocations just to iterate and drop. Fix: add a ZipFileReader::Stored variant that wraps CryptoReader directly in Crc32Reader, bypassing both BufReader and Decompressor. This required adding a Debug impl for CryptoReader since the new Stored variant needs it for the derive(Debug) on ZipFileReader. Measured on bazel.jar (23,631 Stored entries): by_index() allocations: before: 2.0 allocs/entry, 8,632 bytes alloc'd each after: 1.0 allocs/entry, 256 bytes alloc'd each cargo bench read_entry (1 MiB Stored file): before: 106,657 ns/iter (9,831 MB/s) after: 101,276 ns/iter (10,353 MB/s, +5.3%) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gemini-code-assist · 2026-03-20T14:12:39Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the handling of uncompressed (Stored) files within the ZIP archive reading logic. By introducing a dedicated reader path for these files, it eliminates redundant buffering and decompression steps, leading to more efficient memory usage and faster data access. The changes are focused on optimizing resource consumption when dealing with files that do not require actual decompression.

Highlights

Performance Optimization for Stored Files: Introduced a specialized ZipFileReader::Stored variant to bypass unnecessary BufReader and Decompressor allocations for files with CompressionMethod::Stored. This significantly reduces memory allocations and slightly improves read performance for uncompressed entries.
Reduced Allocations: The change reduces allocations from 2.0 allocs/entry (8,632 bytes each) to 1.0 allocs/entry (256 bytes each) for by_index() operations on Stored files.
Improved Read Speed: Benchmarking shows a 5.3% improvement in read speed for 1 MiB Stored files, from 106,657 ns/iter to 101,276 ns/iter.
Debug Implementation: Added a Debug implementation for CryptoReader to support the derive(Debug) on ZipFileReader after the introduction of the new Stored variant.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist · 2026-03-20T14:21:37Z

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

Its-Just-Nans · 2026-03-20T17:29:04Z

Hi

Thanks for the MR

I'm curious, which tools do you use to get this result?

  by_index() allocations:
    before: 2.0 allocs/entry, 8,632 bytes alloc'd each
    after:  1.0 allocs/entry,   256 bytes alloc'd each

lovasoa · 2026-03-20T22:07:06Z

My harness is here: https://github.com/lovasoa/zip2/blob/reduce-allocations/examples%2Fmemory_bench_alloc.rs

Pr0methean · 2026-03-22T06:59:05Z

/Gemini review

gemini-code-assist

Code Review

This pull request enhances the ZipFileReader in src/read.rs by introducing explicit handling for Stored (uncompressed) data. A new Stored variant is added to the ZipFileReader enum, along with corresponding implementations for the Read trait methods (read, read_exact, read_to_end, read_to_string) and the into_inner method. The make_reader function is updated to correctly construct a ZipFileReader::Stored when the compression method is Stored. Additionally, a Debug implementation was added for the CryptoReader enum to facilitate debugging.

Pr0methean · 2026-03-22T07:01:44Z

@gemini-code-assist Will this cause a performance regression when reading from a file?

gemini-code-assist · 2026-03-22T07:03:54Z

Hi @Pr0methean,

That's a great question! The benchmark results provided in the pull request description show a 5.3% improvement in read speed for 1 MiB Stored files. This suggests that the changes should not cause a performance regression when reading from a file, and may even improve it in some cases.

However, it's important to consider that these benchmarks are just one data point, and performance may vary depending on the specific file being read, the hardware being used, and other factors. It would be helpful to run additional benchmarks on a wider variety of files and systems to get a more complete picture of the performance impact of these changes.

Additionally, it's worth noting that the primary goal of this pull request is to reduce memory allocations, which can have a positive impact on overall performance even if read speed is not significantly affected. The reduced allocations can lead to less garbage collection overhead and improved memory locality, which can improve performance in memory-constrained environments.

Pr0methean · 2026-03-22T07:13:43Z

@gemini-code-assist Can this change increase the number of I/O syscalls?

gemini-code-assist · 2026-03-22T07:15:53Z

Hi @Pr0methean,

That's another excellent question! Whether this change increases the number of I/O syscalls depends on a few factors:

Buffering: The original code uses BufReader, which performs its own internal buffering. By removing BufReader for stored files, we might reduce buffering in some cases, potentially leading to more frequent, smaller reads from the underlying CryptoReader.
Read Sizes: If the read sizes requested by the user are smaller than the internal buffer size of BufReader, then removing it could lead to more syscalls. Conversely, if the read sizes are larger, the impact might be negligible or even beneficial.
Underlying Reader: The CryptoReader itself might perform buffering, depending on its implementation and the underlying reader it wraps. If CryptoReader already buffers efficiently, the impact of removing BufReader might be minimal.

To get a definitive answer, it would be necessary to:

Profile the code: Use profiling tools to measure the number of syscalls before and after the change, under various scenarios (different file sizes, read sizes, etc.).
Analyze CryptoReader: Examine the implementation of CryptoReader to understand its buffering behavior.

In summary, it's possible that this change could increase the number of I/O syscalls in certain situations, but it's not guaranteed. Profiling and further analysis are needed to determine the actual impact.

lovasoa marked this pull request as ready for review March 20, 2026 15:44

move Debug defition after type and add debug information for Aes

f81b7b9

cargo fmt

3d1fc47

Its-Just-Nans self-requested a review March 20, 2026 17:31

Its-Just-Nans added this to the 8.4.0 milestone Mar 20, 2026

Its-Just-Nans approved these changes Mar 20, 2026

View reviewed changes

Its-Just-Nans requested a review from Pr0methean March 21, 2026 22:49

gemini-code-assist Bot reviewed Mar 22, 2026

View reviewed changes

Pr0methean added this pull request to the merge queue Mar 22, 2026

Merged via the queue into zip-rs:master with commit c2999bc Mar 22, 2026
130 checks passed

Pr0methean mentioned this pull request Mar 22, 2026

chore: release v8.4.0 #746

Merged

lovasoa deleted the skip-bufreader-stored branch March 22, 2026 13:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: skip BufReader for Stored files in make_reader#739

perf: skip BufReader for Stored files in make_reader#739
Pr0methean merged 3 commits into
zip-rs:masterfrom
lovasoa:skip-bufreader-stored

lovasoa commented Mar 20, 2026

Uh oh!

gemini-code-assist Bot commented Mar 20, 2026

Uh oh!

gemini-code-assist Bot commented Mar 20, 2026

Uh oh!

Its-Just-Nans commented Mar 20, 2026

Uh oh!

lovasoa commented Mar 20, 2026

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot commented Mar 22, 2026

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot commented Mar 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

lovasoa commented Mar 20, 2026

Uh oh!

gemini-code-assist Bot commented Mar 20, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot commented Mar 20, 2026

Uh oh!

Its-Just-Nans commented Mar 20, 2026

Uh oh!

lovasoa commented Mar 20, 2026

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot commented Mar 22, 2026

Uh oh!

Pr0methean commented Mar 22, 2026

Uh oh!

gemini-code-assist Bot commented Mar 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants