Skip to content

refactor: change extra_field from Arc<Vec<u8>> to Arc<[u8]>#741

Merged
Pr0methean merged 7 commits into
zip-rs:masterfrom
lovasoa:extra-field-arc-slice
Apr 1, 2026
Merged

refactor: change extra_field from Arc<Vec<u8>> to Arc<[u8]>#741
Pr0methean merged 7 commits into
zip-rs:masterfrom
lovasoa:extra-field-arc-slice

Conversation

@lovasoa

@lovasoa lovasoa commented Mar 20, 2026

Copy link
Copy Markdown
Contributor

ZipFileData currently stores extra_field and central_extra_field as Option<Arc<Vec<u8>>>. This requires two pointer dereferences to reach the data (Arc ptr → Vec ptr → heap buffer), and the Vec's 3-word header (ptr, len, capacity) wastes space since the data is never resized after construction.

This changes the fields to Option<Arc<[u8]>>, which stores the byte slice inline in the Arc allocation. One indirection instead of two, and one usize saved per field (no capacity).

The AES extra data mutation path (update_aes_extra_data) can no longer use Arc::make_mut since [u8] is unsized. Instead it copies to a Vec, mutates, and converts back. I think it's acceptable since it only runs once during AES encryption setup on the write path.

This is a breaking change since the fields are exposed publicly

Problem: ZipFileData stores extra_field and central_extra_field as
Option<Arc<Vec<u8>>>. This requires two pointer dereferences to
reach the data (Arc ptr → Vec ptr → heap buffer), and the Vec's
3-word header (ptr, len, capacity) wastes space since the data is
never resized after construction.

Fix: change to Option<Arc<[u8]>>, which stores the byte slice
inline in the Arc allocation. One fewer indirection, and 16 bytes
saved per field (no Vec header or capacity).

The AES extra data mutation path (update_aes_extra_data) can no
longer use Arc::make_mut since [u8] is unsized. Instead it copies
to a Vec, mutates, and converts back — acceptable since it only
runs once during AES encryption setup.

Measured on bazel.jar (23,631 Stored entries):
  ZipArchive::new() net memory per entry:
    before: 490 bytes/entry
    after:  482 bytes/entry (-1.6%)
  (allocation count unchanged at 4.0/entry — this change reduces
  per-allocation overhead, not allocation count)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the storage of extra data fields within ZipFileData by changing their underlying type from Arc<Vec<u8>> to Arc<[u8]>. This modification enhances performance by reducing pointer dereferences and improves memory usage by eliminating the overhead of Vec's internal structure. The change primarily affects how these byte slices are managed and accessed, with a minor adjustment to a specific mutation path for AES extra data.

Highlights

  • Memory Optimization: Refactored extra_field and central_extra_field in ZipFileData from Option<Arc<Vec<u8>>> to Option<Arc<[u8]>> to improve memory efficiency.
  • Performance Improvement: Reduced pointer dereferences from two to one for accessing extra field data, leading to better performance.
  • AES Extra Data Handling: Adjusted the update_aes_extra_data function to temporarily convert Arc<[u8]> to Vec<u8> for mutation, then back to Arc<[u8]>, as Arc::make_mut is not applicable to unsized types.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist

Copy link
Copy Markdown
Contributor

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

@Pr0methean

Copy link
Copy Markdown
Member

/gemini review

Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
Comment thread src/read.rs Fixed
Comment thread src/read.rs Fixed
Comment thread src/write.rs Fixed
@Pr0methean Pr0methean modified the milestone: 8.4.0 Apr 1, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request optimizes memory usage in the ZipFileData struct by replacing Arc<Vec<u8>> with Arc<[u8]> for extra fields, thereby eliminating a level of pointer indirection. The implementation involves updating several files to use Arc::from and boxed slices during field initialization and modification. I have no feedback to provide.

@Pr0methean Pr0methean added this to the 8.5.0 milestone Apr 1, 2026
Comment thread src/read.rs Fixed
@Pr0methean Pr0methean marked this pull request as ready for review April 1, 2026 02:18
@Pr0methean Pr0methean enabled auto-merge April 1, 2026 02:18
@Its-Just-Nans

Copy link
Copy Markdown
Member

I thought this would be a MAJOR breaking
but apparently not

@Pr0methean

Copy link
Copy Markdown
Member

Maybe there's a special exception for switching between Vec and slice, just because Vec is so rarely used otherwise than through AsRef, AsMut, Index or IndexMut on imported structs.

@Pr0methean Pr0methean added this pull request to the merge queue Apr 1, 2026
Merged via the queue into zip-rs:master with commit 1043e92 Apr 1, 2026
133 checks passed
@Pr0methean Pr0methean mentioned this pull request Apr 1, 2026
@lovasoa lovasoa deleted the extra-field-arc-slice branch April 1, 2026 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants