Skip to content

feat: Implement by_path*() methods on ZipArchive#382

Merged
Pr0methean merged 8 commits intozip-rs:masterfrom
neumann-paulmann:by_path
Sep 5, 2025
Merged

feat: Implement by_path*() methods on ZipArchive#382
Pr0methean merged 8 commits intozip-rs:masterfrom
neumann-paulmann:by_path

Conversation

@neumann-paulmann
Copy link
Contributor

Fixes #381.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @neumann-paulmann, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the zip crate by adding new capabilities to the ZipArchive struct, allowing users to access files within a ZIP archive directly by their path. This includes support for both unencrypted and password-protected (ZipCrypto) entries, streamlining file retrieval operations. The changes are accompanied by new tests to ensure correctness.

Highlights

  • New by_path_decrypt method: I've introduced a new public method by_path_decrypt to the ZipArchive struct. This method allows users to retrieve a ZipFile entry by its path within the archive, and it attempts to decrypt the file using a provided password. It includes important warnings regarding the security limitations of the underlying ZipCrypto algorithm.
  • New by_path method: I've also added a by_path method to ZipArchive. This method provides a straightforward way to retrieve a ZipFile entry by its path without attempting decryption, serving as a convenient counterpart to the existing by_index method.
  • New Unit Tests: Comprehensive unit tests have been added in a new file tests/by_path.rs to validate the functionality of both the by_path and by_path_decrypt methods, ensuring they correctly retrieve and, where applicable, decrypt file contents.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the by_path and by_path_decrypt methods to the ZipArchive struct, allowing for file retrieval using paths. The implementation is well-structured and includes comprehensive tests. Refactoring the test file to reduce code duplication would improve maintainability.

tests/by_path.rs Outdated
Comment on lines +12 to +27
fn by_path() {
let mut buf = Vec::new();
let mut zip = ZipWriter::new(Cursor::new(&mut buf));
let options = SimpleFileOptions::default().compression_method(zip::CompressionMethod::Stored);
zip.add_directory(DIRECTORY_NAME, options).unwrap();
zip.start_file(format!("{DIRECTORY_NAME}/{FILE_NAME}"), options)
.unwrap();
zip.write_all(LOREM_IPSUM).unwrap();
zip.finish().unwrap();

let mut archive = ZipArchive::new(Cursor::new(&mut buf)).unwrap();
let path = Path::new(DIRECTORY_NAME).join(FILE_NAME);
let mut file = archive.by_path(path).unwrap();
let mut file_buf = Vec::new();
file.read_to_end(&mut file_buf).unwrap();
assert_eq!(LOREM_IPSUM, file_buf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The by_path test function can be simplified by extracting the zip archive creation and file content verification into reusable helper functions to reduce code duplication and improve readability.

fn create_test_zip() -> Vec<u8> {
    let mut buf = Vec::new();
    let mut zip = ZipWriter::new(Cursor::new(&mut buf));
    let options = SimpleFileOptions::default().compression_method(zip::CompressionMethod::Stored);
    zip.add_directory(DIRECTORY_NAME, options).unwrap();
    zip.start_file(format!("{DIRECTORY_NAME}/{FILE_NAME}"), options)
        .unwrap();
    zip.write_all(LOREM_IPSUM).unwrap();
    zip.finish().unwrap();
    buf
}

fn verify_file_content(mut file: ZipFile) {
    let mut file_buf = Vec::new();
    file.read_to_end(&mut file_buf).unwrap();
    assert_eq!(LOREM_IPSUM, file_buf);
}

#[test]
fn by_path() {
    let mut buf = create_test_zip();
    let mut archive = ZipArchive::new(Cursor::new(&mut buf)).unwrap();
    let path = Path::new(DIRECTORY_NAME).join(FILE_NAME);
    let file = archive.by_path(path).unwrap();
    verify_file_content(file);
}

Comment on lines +31 to +48
fn by_path_decrypt() {
let mut buf = Vec::new();
let mut zip = ZipWriter::new(Cursor::new(&mut buf));
let options = SimpleFileOptions::default()
.compression_method(zip::CompressionMethod::Stored)
.with_aes_encryption(AesMode::Aes128, PASSWORD);
zip.add_directory(DIRECTORY_NAME, options).unwrap();
zip.start_file(format!("{DIRECTORY_NAME}/{FILE_NAME}"), options)
.unwrap();
zip.write_all(LOREM_IPSUM).unwrap();
zip.finish().unwrap();

let mut archive = ZipArchive::new(Cursor::new(&mut buf)).unwrap();
let path = Path::new(DIRECTORY_NAME).join(FILE_NAME);
let mut file = archive.by_path_decrypt(path, PASSWORD.as_bytes()).unwrap();
let mut file_buf = Vec::new();
file.read_to_end(&mut file_buf).unwrap();
assert_eq!(LOREM_IPSUM, file_buf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The by_path_decrypt test function can be simplified by extracting the zip archive creation and file content verification into reusable helper functions to reduce code duplication and improve readability.

fn create_test_zip_encrypted() -> Vec<u8> {
    let mut buf = Vec::new();
    let mut zip = ZipWriter::new(Cursor::new(&mut buf));
    let options = SimpleFileOptions::default()
        .compression_method(zip::CompressionMethod::Stored)
        .with_aes_encryption(AesMode::Aes128, PASSWORD);
    zip.add_directory(DIRECTORY_NAME, options).unwrap();
    zip.start_file(format!("{DIRECTORY_NAME}/{FILE_NAME}"), options)
        .unwrap();
    zip.write_all(LOREM_IPSUM).unwrap();
    zip.finish().unwrap();
    buf
}

fn verify_file_content_encrypted(mut file: ZipFile) {
    let mut file_buf = Vec::new();
    file.read_to_end(&mut file_buf).unwrap();
    assert_eq!(LOREM_IPSUM, file_buf);
}

#[test]
fn by_path_decrypt() {
    let mut buf = create_test_zip_encrypted();
    let mut archive = ZipArchive::new(Cursor::new(&mut buf)).unwrap();
    let path = Path::new(DIRECTORY_NAME).join(FILE_NAME);
    let file = archive.by_path_decrypt(path, PASSWORD.as_bytes()).unwrap();
    verify_file_content_encrypted(file);
}

@neumann-paulmann
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces by_path and by_path_decrypt methods to ZipArchive, allowing file lookup by Path instead of just by string name. The main feedback is to refactor the new methods to reuse the existing by_name_with_optional_password helper to reduce code duplication and improve maintainability.

Comment on lines +1032 to +1040
pub fn by_path_decrypt<T: AsRef<Path>>(
&mut self,
path: T,
password: &[u8],
) -> ZipResult<ZipFile<'_, R>> {
self.index_for_path(path)
.ok_or(ZipError::FileNotFound)
.and_then(|index| self.by_index_with_optional_password(index, Some(password)))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This implementation can be simplified by reusing the existing by_name_with_optional_password helper function. This will reduce code duplication and make the implementation more concise. by_name_with_optional_password already handles finding the index and returning ZipError::FileNotFound if not found.

        self.by_name_with_optional_password(&path_to_string(path), Some(password))

Comment on lines +1042 to +1047
/// Search for a file entry by path
pub fn by_path<T: AsRef<Path>>(&mut self, path: T) -> ZipResult<ZipFile<'_, R>> {
self.index_for_path(path)
.ok_or(ZipError::FileNotFound)
.and_then(|index| self.by_index_with_optional_password(index, None))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to by_path_decrypt, this implementation can be simplified by reusing the existing by_name_with_optional_password helper function. This improves consistency and reduces code duplication.

        self.by_name_with_optional_password(&path_to_string(path), None)

@Pr0methean Pr0methean enabled auto-merge September 4, 2025 19:07
…est that uses them, since they're not used when the `aes-crypto` feature isn't enabled
@Pr0methean Pr0methean added this pull request to the merge queue Sep 5, 2025
Merged via the queue into zip-rs:master with commit 68882b2 Sep 5, 2025
62 of 65 checks passed
@Pr0methean Pr0methean mentioned this pull request Sep 5, 2025
cosmicexplorer pushed a commit to cosmicexplorer/zip2 that referenced this pull request Sep 30, 2025
* Implement by_path*() methods.

* Refactor by_path tests.

* fix: test requires aes-crypto

* fix: Move `use zip::AesMode` and declaration of `PASSWORD` into the test that uses them, since they're not used when the `aes-crypto` feature isn't enabled

---------

Co-authored-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
Co-authored-by: hennickc <hennickc@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement retrieval of ZIP file entries by std::path::Path

2 participants