Fix corruption of large files when zip64 is used #113

Merged: mrkkrp merged 1 commit into master from issue-111 on Apr 20, 2024

@mrkkrp mrkkrp commented Apr 19, 2024

Close #111.

Previously, the code assumed that the initial stub local header (written with uncompressed and compressed sizes set to 0) had the same size as the final local header. That assumption was wrong: the local header size depends on the uncompressed and compressed sizes of the corresponding data, which are only known after the data has been streamed, because these sizes determine whether a zip64 extra field entry is included in the header. As a result, when an entry was large enough to need zip64, the final (longer) local header, written by seeking back to the beginning of the stub after streaming, would overwrite the beginning of the data.
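To illustrate the failure mode, here is a minimal Python sketch. It is not the library's code: `old_style_local_header` is a hypothetical helper that follows the general local header layout from the zip spec (30 fixed bytes, then the file name, then extra fields) and, as a simplifying assumption, writes both 8-byte sizes whenever it adds a zip64 extra field. A short in-memory payload stands in for the streamed data.

```python
import io
import struct

ZIP64_LIMIT = 0xFFFFFFFF  # sizes at or above this do not fit in the 32-bit fields


def old_style_local_header(name: bytes, compressed: int, uncompressed: int) -> bytes:
    """Hypothetical model of the old behaviour: zip64 field only on overflow."""
    needs_zip64 = compressed >= ZIP64_LIMIT or uncompressed >= ZIP64_LIMIT
    extra = b""
    if needs_zip64:
        # Header ID 0x0001, data size 16, then both 8-byte sizes (20 bytes total).
        extra = struct.pack("<HHQQ", 0x0001, 16, uncompressed, compressed)
    fixed = struct.pack(
        "<IHHHHHIIIHH",
        0x04034B50,                      # local file header signature
        45 if needs_zip64 else 20,       # version needed to extract
        0, 0, 0, 0,                      # flags, method (stored), mod time, mod date
        0,                               # CRC-32, left at 0 in this sketch
        ZIP64_LIMIT if needs_zip64 else compressed,
        ZIP64_LIMIT if needs_zip64 else uncompressed,
        len(name),
        len(extra),
    )
    return fixed + name + extra


name = b"big.bin"
stub = old_style_local_header(name, 0, 0)                    # written before streaming
final = old_style_local_header(name, 5 * 2**30, 5 * 2**30)   # sizes known afterwards

buf = io.BytesIO()
buf.write(stub)
payload = b"x" * 64      # stand-in for the streamed data
buf.write(payload)
buf.seek(0)              # seek back to the start of the stub
buf.write(final)         # the longer header runs past the stub

data_after = buf.getvalue()[len(stub):]
corrupted = data_after != payload
print(len(stub), len(final), corrupted)
```

In this model the stub is 37 bytes and the final header is 57 bytes, so the first 20 bytes of the data are overwritten by the zip64 extra field.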

This is fixed by

  • always writing a zip64 entry in local headers, which does not violate the spec and will be safely ignored in the case of smaller entries, and
  • respecting the spec more precisely where it says that whenever there is a zip64 extra field entry in a local header both uncompressed and compressed sizes must always be written.

This is deemed safe because the only source of size variation for local headers is the uncompressed and compressed sizes of the corresponding data.
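The effect of the fix can be sketched the same way. Again, this is an illustrative model rather than the library's implementation: `fixed_size_local_header` is a hypothetical helper, and the choices of always setting "version needed" to 45 and capping the 32-bit size fields at `0xFFFFFFFF` are assumptions made for this example. The point is only that the header length now depends on nothing but the file name length.

```python
import io
import struct

ZIP64_LIMIT = 0xFFFFFFFF


def fixed_size_local_header(name: bytes, compressed: int, uncompressed: int) -> bytes:
    """Hypothetical model of the fix: the zip64 extra field is always present."""
    # Both 8-byte sizes are always written, so the field is always 20 bytes.
    extra = struct.pack("<HHQQ", 0x0001, 16, uncompressed, compressed)
    fixed = struct.pack(
        "<IHHHHHIIIHH",
        0x04034B50,                       # local file header signature
        45,                               # version needed (assumption: always 4.5)
        0, 0, 0, 0,                       # flags, method (stored), mod time, mod date
        0,                                # CRC-32, left at 0 in this sketch
        min(compressed, ZIP64_LIMIT),     # assumption: cap the 32-bit fields
        min(uncompressed, ZIP64_LIMIT),
        len(name),
        len(extra),
    )
    return fixed + name + extra


name = b"big.bin"
stub = fixed_size_local_header(name, 0, 0)
final = fixed_size_local_header(name, 5 * 2**30, 5 * 2**30)

buf = io.BytesIO()
buf.write(stub)
payload = b"x" * 64      # stand-in for the streamed data
buf.write(payload)
buf.seek(0)
buf.write(final)         # same length as the stub, so the data is untouched

intact = buf.getvalue()[len(stub):] == payload
print(len(stub), len(final), intact)
```

Because the stub and the final header have the same length regardless of the sizes, seeking back and rewriting the header can no longer touch the data that follows it.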

@mrkkrp mrkkrp merged commit da5df36 into master Apr 20, 2024
@mrkkrp mrkkrp deleted the issue-111 branch April 20, 2024 07:34

Development

Successfully merging this pull request may close these issues.

Unable to extract 4GB+ files from archive.
