Skip to content

[DRAFT][WIP]Add Zip Archives password support#122093

Draft
alinpahontu2912 wants to merge 60 commits intodotnet:mainfrom
alinpahontu2912:zip_password
Draft

[DRAFT][WIP]Add Zip Archives password support#122093
alinpahontu2912 wants to merge 60 commits intodotnet:mainfrom
alinpahontu2912:zip_password

Conversation

@alinpahontu2912
Copy link
Member

@alinpahontu2912 alinpahontu2912 commented Dec 2, 2025

Fixes #1545

Big Milestones and status:

  • Read Encrypted Entries (ZipCrypto, WinZip AES)
  • Create Encrypted Entries (ZipCrypto, Winzip AES)
  • Update Mode
  • Add runtime assets for checking compat with WinRar/7zip/WinZip/etc
  • Security check for Cryptography methods
  • Final API design

@rzikm rzikm self-requested a review December 11, 2025 12:22
Copilot AI review requested due to automatic review settings February 6, 2026 12:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.

Copilot AI review requested due to automatic review settings February 9, 2026 13:24
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 6 comments.

Copilot AI review requested due to automatic review settings February 13, 2026 10:30
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 8 comments.

Copilot AI review requested due to automatic review settings February 16, 2026 11:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.

Copy link
Member

@bartonjs bartonjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General feedback:

  • Secrets (including an input password) should never be copied to the heap, unless it's utterly impractical.
  • Secrets (including an input password) should also only minimally be copied around on the stack.
  • Copied secrets should be cleared.
  • Avoid having cryptographic keys (HMAC, AES, etc) stored in arrays.

GitHub's current UI is making this change hard to holistically review, and it's marked as draft, so I've only sprinkled in the most important comments.

uint key2 = 878082192;

// ZipCrypto uses raw bytes; ASCII is the most interoperable
var bytes = Encoding.ASCII.GetBytes(password.ToArray());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

password.ToArray() is bad, you're making an uncleared copy of the secret.

Since you have a streaming protocol, you can just use a stackalloc buffer and walk it pieces at a time. (Especially because I think you're only on .NET "Core")

const int StackAllocSize = 64;
const int SegmentSize = StackAllocSize / 2;
Debug.Assert(Encoding.ASCII.GetMaxByteCount(SegmentSize) == StackAllocSize);

Span<byte> buf = stackalloc byte[64];
ReadOnlySpan<byte> pwSpan = password.Span;

while (!pwSpan.IsEmpty)
{
    ReadOnlySpan<byte> segment = pwSpan;

    if (segment.Length > SegmentSize)
    {
        segment = segment.Slice(0, SegmentSize);
    }

    int byteCount = Encoding.ASCII.GetBytes(segment, buf);

    foreach (byte b in buf.Slice(0, byteCount))
    {
        UpdateKeys(ref key0, ref key1, ref key2, b);
    }

    pwSpan = pwSpan.Slice(segment.Length);
}

...

// Creates the persisted key bytes from a password.
// The returned byte array contains the 3 ZipCrypto keys (key0, key1, key2)
// serialized as 12 bytes in little-endian format.
public static byte[] CreateKey(ReadOnlyMemory<char> password)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's always a good idea to avoid public in non-public classes, since code reviewers then have to stop and think "is this externally callable?"

Suggested change
public static byte[] CreateKey(ReadOnlyMemory<char> password)
private static byte[] CreateKey(ReadOnlyMemory<char> password)

}

// Serialize the 3 keys to bytes in little-endian format
byte[] keyBytes = new byte[KeySize];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid making an array for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, it just gets packed into this array then unpacked into 3 integers later. Return a struct of 3 integers... try to keep the key off the heap.

// Creates the persisted key bytes from a password.
// The returned byte array contains the 3 ZipCrypto keys (key0, key1, key2)
// serialized as 12 bytes in little-endian format.
public static byte[] CreateKey(ReadOnlyMemory<char> password)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this taking ReadOnlyMemory when it could take ReadOnlySpan?

_key2 = BinaryPrimitives.ReadUInt32LittleEndian(keyBytes.AsSpan(8, 4));
}

private byte[] CalculateHeader()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of returning an array, can it be given the destination span to write to?

for (int i = 0; i < KeystreamBufferSize; i += BlockSize)
{
BinaryPrimitives.WriteUInt128LittleEndian(counterBlock, _counter);
_aesEncryptor.TransformBlock(counterBlock, 0, BlockSize, _keystreamBuffer, i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend using _aes.EncryptEcb over an ICryptoTransform, it'll allow you to work with spans instead of arrays.

byte[] counterBlock = new byte[BlockSize];

// Generate KeystreamBufferSize bytes of keystream (256 blocks of 16 bytes each)
for (int i = 0; i < KeystreamBufferSize; i += BlockSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 256 calls to encrypt, you could

  • Fill the whole of _keyStreamBuffer with the counter values
  • Call Encrypt/Transform just once, writing in-place.

Comment on lines +415 to +423
private static void XorBytes(Span<byte> dest, ReadOnlySpan<byte> src)
{
Debug.Assert(dest.Length <= src.Length);

for (int i = 0; i < dest.Length; i++)
{
dest[i] ^= src[i];
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This undoubtedly could be vectorized to go faster, if you expect the operation to generally be bigger than a few bytes. Even just doing something like

Span<ulong> destUL = Marshal.As<byte, ulong>(dest);
ReadOnlySpan<ulong> srcUL = Marshal.As<byte, ulong>(src);

XorUL(destUL, srcUL);

for (int i = destUL.Length * sizeof(ulong); i < dest.Length; i++)
{
    ...
}

(Though I think if src/dest isn't QWORD aligned that might do something less happy on ARM?)

Copy link
Member

@rzikm rzikm Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be premature, but if we want to vectorize this, then approach with something like Vector<byte> might be better.

// Validate auth code immediately when we've read all encrypted data
if (_encryptedDataRemaining <= 0)
{
ValidateAuthCode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this isn't the right approach... you've already written all of the data into the caller buffer before telling them it's bad. But the only way you can avoid that is to give up on streaming, and I assume you don't want to do that.

(With streaming enabled, even if you did buffer.Clear() on failure, someone could already have read all but the last byte on previous calls)

So, no action here, other than "be educated that this is a bad practice, and is why .NET's AES-GCM doesn't allow streaming decryption"

public static void ExtractToDirectory(this System.IO.Compression.ZipArchive source, string destinationDirectoryName) { }
public static void ExtractToDirectory(this System.IO.Compression.ZipArchive source, string destinationDirectoryName, bool overwriteFiles) { }
public static void ExtractToDirectory(this System.IO.Compression.ZipArchive source, string destinationDirectoryName, bool overwriteFiles, string password) { }
public static void ExtractToDirectory(this System.IO.Compression.ZipArchive source, string destinationDirectoryName, string password) { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many of these operations could take the password as a ReadOnlySpan (so that it could have been securely read into the stack, or native memory), or (at worst) a ReadOnlyMemory?

String is definitely friendlier, but being able to avoid a heap copy of a secret is a more secure design.

Copilot AI review requested due to automatic review settings March 5, 2026 11:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 9 comments.

Comment on lines +426 to +429
if (string.IsNullOrEmpty(password))
{
throw new ArgumentNullException(nameof(password), SR.EmptyPassword);
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overloads treat an empty password the same as null and throw ArgumentNullException. Since the parameter is non-nullable and empty is a different invalid value, this should be an ArgumentException (or ArgumentException.ThrowIfNullOrEmpty) rather than ArgumentNullException.

Copilot uses AI. Check for mistakes.
Comment on lines +119 to +123
if (string.IsNullOrEmpty(password))
{
throw new ArgumentNullException(nameof(password), SR.EmptyPassword);
}

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overloads treat an empty password the same as null and throw ArgumentNullException. Since the parameter is non-nullable and empty is a different invalid value, this should be an ArgumentException (or ArgumentException.ThrowIfNullOrEmpty) rather than ArgumentNullException.

Suggested change
if (string.IsNullOrEmpty(password))
{
throw new ArgumentNullException(nameof(password), SR.EmptyPassword);
}
if (password is null)
{
throw new ArgumentNullException(nameof(password));
}
if (password.Length == 0)
{
throw new ArgumentException(SR.EmptyPassword, nameof(password));
}

Copilot uses AI. Check for mistakes.
Comment on lines +255 to +259
if (!IsEncrypted)
{
throw new InvalidDataException(SR.EntryNotEncrypted);
}
return await OpenInReadModeAsync(checkOpenable: true, cancellationToken, password.AsMemory()).ConfigureAwait(false);
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenAsync(string password, EncryptionMethod encryptionMethod) in Read mode currently ignores the fact that an encryption method was specified and proceeds to open for reading. This contradicts the sync Open(password, encryptionMethod) behavior (and the SR.EncryptionReadMode message), and makes the API inconsistent. In Read mode this overload should throw InvalidOperationException (SR.EncryptionReadMode), matching the other overloads.

Suggested change
if (!IsEncrypted)
{
throw new InvalidDataException(SR.EntryNotEncrypted);
}
return await OpenInReadModeAsync(checkOpenable: true, cancellationToken, password.AsMemory()).ConfigureAwait(false);
throw new InvalidOperationException(SR.EncryptionReadMode);

Copilot uses AI. Check for mistakes.
Comment on lines +145 to +148
using (FileStream fs = new FileStream(extractPath, fileStreamOptions))
{
using (Stream es = !string.IsNullOrEmpty(password) ? source.Open(password) : source.Open())
es.CopyTo(fs);
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The password overload falls back to source.Open()/OpenAsync() when the provided password is empty. Since the API contract is “extract decrypted using the specified password”, accepting an empty string and silently treating it as “no password” is inconsistent (and will also fail for encrypted entries). Prefer validating the password (throw for null/empty) and always calling the password-based Open/OpenAsync overload.

Copilot uses AI. Check for mistakes.
Comment on lines +312 to +350
<data name="InvalidWinZipSize" xml:space="preserve">
<value>Stream size is too small for WinZip standard</value>
</data>
<data name="InvalidPassword" xml:space="preserve">
<value>Invalid password</value>
</data>
<data name="WinZipAuthCodeMismatch" xml:space="preserve">
<value>Authentication code mismatch for WinZip encrypted entry.</value>
</data>
<data name="EntryNotEncrypted" xml:space="preserve">
<value>Entry is not encrypted.</value>
</data>
<data name="PasswordRequired" xml:space="preserve">
<value>A password is required for encrypted entries.</value>
</data>
<data name="EmptyPassword" xml:space="preserve">
<value>Empty password was provided for encrypting this entry</value>
</data>
<data name="InvalidAesStrength" xml:space="preserve">
<value>Invalid AES strength value.</value>
</data>
<data name="TruncatedZipCryptoHeader" xml:space="preserve">
<value>Truncated ZipCrypto header.</value>
</data>
<data name="CrcMismatch" xml:space="preserve">
<value>CRC mismatch.</value>
</data>
<data name="EncryptionNotSpecified" xml:space="preserve">
<value>Encryption type is not specified</value>
</data>
<data name="EncryptionReadMode" xml:space="preserve">
<value>Encryption Method should not be specified in Read Mode</value>
</data>
<data name="EncryptionUpdateMode" xml:space="preserve">
<value>Encryption Method should not be specified in Update Mode</value>
</data>
<data name="WinZipEncryptionNotSupportedOnBrowser" xml:space="preserve">
<value>WinZip encryption is not supported on browser</value>
</data>
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several newly added resource strings are inconsistent with existing message style: missing trailing periods, inconsistent capitalization (“Encryption Method”), and slightly ungrammatical phrasing (e.g., “on browser”). It would be better to align with the surrounding resources (sentence case + period) to keep exception messages consistent.

Copilot uses AI. Check for mistakes.

private void EnsureHeader()
{
WriteHeaderCore(isAsync: false).AsTask().GetAwaiter().GetResult();
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZipCryptoStream.EnsureHeader synchronously blocks on a ValueTask by calling AsTask().GetAwaiter().GetResult(), which allocates a Task unnecessarily. ValueTask supports GetAwaiter() directly; using that avoids the allocation on every write path.

Suggested change
WriteHeaderCore(isAsync: false).AsTask().GetAwaiter().GetResult();
WriteHeaderCore(isAsync: false).GetAwaiter().GetResult();

Copilot uses AI. Check for mistakes.
Comment on lines +323 to +334
EnsureHeader();

byte[] tmp = new byte[buffer.Length];
for (int i = 0; i < buffer.Length; i++)
{
byte ks = DecryptByte(_key2);
byte p = buffer[i];
tmp[i] = (byte)(p ^ ks);
UpdateKeys(ref _key0, ref _key1, ref _key2, p);
}
_base.Write(tmp, 0, tmp.Length);
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZipCryptoStream.Write allocates a new byte[] the size of the caller’s buffer on every call, which can be very expensive for typical stream copy buffer sizes (e.g., 80KB). Consider renting from ArrayPool (and returning in a finally) or chunking to a reusable buffer to avoid per-call large allocations.

Copilot uses AI. Check for mistakes.
Comment on lines +622 to +652
public override void Write(ReadOnlySpan<byte> buffer)
{
ThrowIfNotWritable();
if (!_headerWritten)
{
WriteHeader();
}

byte[] workBuffer = new byte[KeystreamBufferSize];
WriteCore(buffer, workBuffer);
}

public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
{
ValidateBufferArguments(buffer, offset, count);
return WriteAsyncCore(buffer.AsMemory(offset, count), cancellationToken).AsTask();
}

private async ValueTask WriteAsyncCore(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
ThrowIfNotWritable();
if (!_headerWritten)
{
await WriteHeaderAsync(cancellationToken).ConfigureAwait(false);
}

int inputOffset = 0;
int inputCount = buffer.Length;
byte[] workBuffer = new byte[KeystreamBufferSize];

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WinZipAesStream allocates a new 4KB work buffer on every Write/WriteAsync call. For typical usage (CopyTo with many writes), this becomes a steady allocation stream. Consider using ArrayPool or a reusable instance buffer instead.

Copilot uses AI. Check for mistakes.
Comment on lines +798 to +816
public void WriteBlock(Stream stream)
{
Span<byte> buffer = new byte[TotalSize];
WriteBlockCore(buffer);
stream.Write(buffer);
}

public async Task WriteBlockAsync(Stream stream, CancellationToken cancellationToken = default)
{
byte[] buffer = new byte[TotalSize];
WriteBlockCore(buffer);
await stream.WriteAsync(buffer, cancellationToken).ConfigureAwait(false);
}

private void WriteBlockCore(Span<byte> buffer)
{
BinaryPrimitives.WriteUInt16LittleEndian(buffer.Slice(0), HeaderId);
BinaryPrimitives.WriteUInt16LittleEndian(buffer.Slice(2), 7); // DataSize
BinaryPrimitives.WriteUInt16LittleEndian(buffer.Slice(4), VendorVersion);
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WinZipAesExtraField.WriteBlock allocates on the heap (new byte[TotalSize]) and WriteBlockCore uses a hard-coded size value (7) even though DataSize exists. Since TotalSize/DataSize are constants and the block is tiny, prefer stackalloc (or reuse DataSize) to avoid allocations and magic numbers.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add password to ZipArchive

6 participants