Appending to zip file with non-UTF-8 filenames corrupts the central directory #361

@j-browne

Describe the bug
I was trying to append to a zip file that contained files with non-UTF-8 filenames. The zip file was created using 7-Zip on Windows 10. After appending, the resulting file is invalid, and different programs behave differently when trying to read or extract it.

To Reproduce
I've attached a minimal example ó.zip, which contains a file named ó.txt (that is 0xa2, not 0xc3 0xb3, in case that ends up getting converted).
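To make the byte-level distinction above concrete, here is a small sketch using only the Rust standard library. The helper names are made up for illustration. It shows that the UTF-8 encoding of ó (U+00F3) is the two bytes 0xc3 0xb3, while the single byte 0xa2 stored in the attached archive is not valid UTF-8 on its own (it is most likely a legacy code page encoding such as CP437, though that is my assumption).

```rust
/// UTF-8 bytes of U+00F3 (ó), written as an escape so the source
/// file's own encoding cannot change the result.
fn utf8_bytes_of_o_acute() -> Vec<u8> {
    "\u{f3}".as_bytes().to_vec()
}

/// The filename bytes as described in this report: 0xa2 followed by ".txt".
/// (Hypothetical helper, only here to illustrate the difference.)
fn legacy_name_bytes() -> Vec<u8> {
    vec![0xa2, b'.', b't', b'x', b't']
}

/// Returns true if `name` can be decoded as UTF-8.
fn is_valid_utf8(name: &[u8]) -> bool {
    std::str::from_utf8(name).is_ok()
}
```

Calling `is_valid_utf8(&legacy_name_bytes())` returns `false`, which is why this archive counts as having a non-UTF-8 filename.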
The following program creates a new zip file if it does not exist or appends to it if it does exist:

use std::{
    env,
    fs::{File, OpenOptions},
    io::copy,
    path::PathBuf,
};
use zip::{write::SimpleFileOptions, ZipWriter};

fn main() {
    let mut args = env::args().skip(1);
    let zip_path = PathBuf::from(args.next().unwrap());
    let file_path = PathBuf::from(args.next().unwrap());

    println!("Adding {file_path:?} to {zip_path:?}");

    let mut zip = {
        let appending = zip_path.exists();
        let file = OpenOptions::new()
            .create(true)
            .truncate(false)
            .read(true)
            .write(true)
            .open(zip_path)
            .unwrap();
        if appending {
            ZipWriter::new_append(file).unwrap()
        } else {
            ZipWriter::new(file)
        }
    };

    let options = SimpleFileOptions::default();
    let stripped_path = file_path.strip_prefix(file_path.parent().unwrap()).unwrap();
    zip.start_file_from_path(stripped_path, options).unwrap();
    let mut file = File::open(file_path).unwrap();
    copy(&mut file, &mut zip).unwrap();
}

Steps to reproduce the behavior:

  1. Run cargo run -- ó.zip a.txt

Expected behavior
The resulting file is expected to be a valid zip file that can be read, extracted, appended to, etc. by any zip program or library.

Instead, different programs exhibit different behavior when trying to read or extract the resulting file:

  • The zip crate is unable to read the file. For example, trying to add another file by running the program again results in ZipWriter::new_append returning InvalidArchive("Could not find EOCD").

  • Unzipping with 7-Zip gives the following error:

    Headers Error
    Warnings:
    There are some data after the end of the payload data
    

    It does, however, extract the archive, but the output contains only ó.txt; the appended a.txt is missing.

  • Extracting with Windows Explorer works, and the extracted files are correct.
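For anyone who wants to check the "Could not find EOCD" and "data after the end of the payload data" symptoms independently, the following is a rough diagnostic sketch using only the standard library. It is not how the zip crate locates the record; the function names are hypothetical. It scans backwards for the end-of-central-directory signature (`PK\x05\x06`) and reports how many bytes follow the record, assuming the standard 22-byte fixed layout with the comment length in the last two bytes.

```rust
/// End of central directory signature: "PK\x05\x06".
const EOCD_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// Size of the fixed part of the EOCD record (without the comment).
const EOCD_MIN_SIZE: usize = 22;

/// Returns the offset of the last EOCD signature that still leaves room
/// for a full fixed-size record, or None if there is none.
fn find_eocd(data: &[u8]) -> Option<usize> {
    let last_start = data.len().checked_sub(EOCD_MIN_SIZE)?;
    (0..=last_start)
        .rev()
        .find(|&i| data[i..].starts_with(&EOCD_SIGNATURE))
}

/// Returns how many bytes follow the EOCD record (including its comment).
/// A well-formed archive should give Some(0).
fn trailing_bytes_after_eocd(data: &[u8]) -> Option<usize> {
    let pos = find_eocd(data)?;
    // The comment length is a little-endian u16 at offset 20 of the record.
    let comment_len = u16::from_le_bytes([data[pos + 20], data[pos + 21]]) as usize;
    data.len().checked_sub(pos + EOCD_MIN_SIZE + comment_len)
}
```

Reading the broken archive with `std::fs::read` and passing the bytes to these functions should show whether an EOCD record exists at all and whether anything was written after it. Note that a naive signature scan like this can be fooled by file data that happens to contain the same four bytes.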

Additional context
Everything works as expected if the file is named a.txt instead of ó.txt.
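Since the filename is the only difference between the working and failing cases, it may help to look at how the name is stored in the original archive. The sketch below is a diagnostic aid under my own assumptions, not the crate's code, and the type and function names are invented for illustration. It parses one central directory file header (signature `PK\x01\x02`, 46 fixed bytes) and returns the raw filename bytes together with general purpose flag bit 11, which the ZIP format uses to mark names as UTF-8.

```rust
/// Central directory file header signature: "PK\x01\x02".
const CDFH_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x01, 0x02];
/// Size of the fixed part of a central directory file header.
const CDFH_FIXED_SIZE: usize = 46;
/// General purpose bit 11: filename and comment are UTF-8.
const UTF8_FLAG: u16 = 1 << 11;

/// Hypothetical result type for this sketch.
struct EntryName {
    utf8_flag: bool,
    raw_name: Vec<u8>,
}

/// Parses a single central directory header starting at `header[0]`.
/// Returns None if the signature is wrong or the data is truncated.
fn read_entry_name(header: &[u8]) -> Option<EntryName> {
    if header.len() < CDFH_FIXED_SIZE || !header.starts_with(&CDFH_SIGNATURE) {
        return None;
    }
    // Flags live at offset 8, the filename length at offset 28 (both u16 LE).
    let flags = u16::from_le_bytes([header[8], header[9]]);
    let name_len = u16::from_le_bytes([header[28], header[29]]) as usize;
    let raw_name = header
        .get(CDFH_FIXED_SIZE..CDFH_FIXED_SIZE + name_len)?
        .to_vec();
    Some(EntryName {
        utf8_flag: flags & UTF8_FLAG != 0,
        raw_name,
    })
}
```

For the attached ó.zip I would expect `utf8_flag` to be false and `raw_name` to start with 0xa2, but I have not verified that against the file.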
