Conversation
Pull request overview
Updates Adler32Tests to validate the actual overflow boundary condition tied to Adler32’s delayed-modulo optimization (NMax), ensuring the test fails if NMax is reduced below the safe threshold.
Changes:
- Reworks `LargeInput_ExceedsNMax` to prime the hasher into a maximal checksum state, then appends max-value bytes to stress 32-bit intermediate overflow behavior.
- Updates expected checksum values to match the new (stronger) test construction.
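For context on the boundary itself (an illustrative sketch, not code from this PR): the classic zlib derivation picks NMax as the largest run length n for which the worst-case unreduced s2 accumulator, (n + 1)(BASE - 1) + 255 * n(n + 1)/2, still fits in 32 bits. A small Python check of that bound:

```python
BASE = 65521  # Adler-32 modulus (largest prime below 2**16)

def max_unreduced_s2(n: int) -> int:
    # Worst case for the s2 accumulator after n bytes of 0xFF appended
    # with no intermediate reduction, starting from a checksum whose
    # s1 and s2 halves are both already maxed at BASE - 1.
    return (n + 1) * (BASE - 1) + 255 * n * (n + 1) // 2

# NMax is the largest n for which delaying the modulo is still safe.
nmax = 0
while max_unreduced_s2(nmax + 1) <= 2**32 - 1:
    nmax += 1

print(nmax)  # 5552: the classic zlib NMAX
```

Running this shows the largest safe run is 5552 bytes, which is why a 5553-byte run of 0xFF from a maxed state is the minimal case that exercises the overflow.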
Tagging subscribers to this area: @dotnet/area-system-io-hashing, @bartonjs, @vcsjones
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/libraries/System.IO.Hashing/tests/Adler32Tests.cs:241
- This change removes `AllMaxBytes_MatchesReference`, which was the only place in this file that validated one-shot `Adler32.HashToUInt32` against the reference implementation for an all-0xFF input across a range of lengths (including small vector-friendly sizes). If the intent isn't to drop that coverage, consider keeping an all-0xFF reference comparison (either by restoring the test or folding a few representative 0xFF cases into an existing theory).
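If that coverage does get folded back in, the reference comparison is cheap to express. A standalone Python sketch (hypothetical, not the C# theory) cross-checking a per-byte reference implementation against the stdlib's `zlib.adler32` for all-0xFF inputs of several lengths:

```python
import zlib

BASE = 65521  # Adler-32 modulus

def adler32_reference(data: bytes, seed: int = 1) -> int:
    # Straightforward per-byte reference with no delayed modulo.
    s1 = seed & 0xFFFF
    s2 = (seed >> 16) & 0xFFFF
    for b in data:
        s1 = (s1 + b) % BASE
        s2 = (s2 + s1) % BASE
    return (s2 << 16) | s1

# Cross-check all-0xFF inputs, from vector-friendly sizes up past NMax.
for length in (1, 16, 64, 1024, 5552, 5553, 11104):
    data = b"\xff" * length
    assert adler32_reference(data) == zlib.adler32(data)

print("all lengths match")
```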
```csharp
/// <summary>
/// Tests incremental appending with various chunk sizes to verify that the
/// vectorized paths produce the same result regardless of how data is fed in.
/// </summary>
```
```csharp
// Starting from an already-maxed checksum, a stream of 5553 max value
// bytes will overflow if not reduced by mod 65521 before the last byte.
```
The comment refers specifically to a stream of "5553" max-value bytes overflowing, but this theory also runs with length=11104. To keep the test documentation accurate, update the wording to refer to the parameter (e.g., "a stream of `length` max-value bytes") and explain why 5553 is the critical boundary case.
```diff
- // Starting from an already-maxed checksum, a stream of 5553 max value
- // bytes will overflow if not reduced by mod 65521 before the last byte.
+ // Starting from an already-maxed checksum, a stream of length max-value
+ // bytes will overflow if not reduced by mod 65521 before the last byte.
+ // 5553 is the smallest such length; 11104 is a larger stress case.
```
```csharp
public void LargeInput_ExceedsNMax(int length, uint expected)
{
    byte[] data = new byte[length];
    for (int i = 0; i < data.Length; i++)
    {
        data[i] = (byte)('a' + (i % 26));
    }

    Assert.Equal(expected, Adler32.HashToUInt32(data));

    // This test ensures that Adler32 optimizations involving delayed modulo
    // do not overflow a 32-bit intermediate at any point.

    var alg = new Adler32();
```
`LargeInput_ExceedsNMax` now relies on first priming the instance to a maxed checksum state before appending the `length` bytes. The current name suggests the test is only about input size, but the setup (maxed starting state + 0xFF data) is the key condition being validated; consider renaming to reflect that so future readers don't accidentally "simplify" the setup and lose the boundary coverage.
I'm going to put up a PR with a new vectorized Adler32 implementation. Will include the test changes in that one.
The existing test doesn't check the actual boundary condition. For example, if `NMax` is changed to `8192` in the Adler32 implementation, the tests still pass. This change correctly breaks if `NMax` is set as small as `5553`.
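To illustrate that claim outside the repo: a Python model of a delayed-modulo Adler-32 (the function names and the 32-bit masking are my own sketch, not the actual implementation) diverges from a per-byte reference on exactly this input once the reduction interval exceeds 5552:

```python
BASE = 65521  # Adler-32 modulus

def adler32_ref(data: bytes, seed: int = 1) -> int:
    # Per-byte reference: reduce mod BASE at every step.
    s1, s2 = seed & 0xFFFF, (seed >> 16) & 0xFFFF
    for b in data:
        s1 = (s1 + b) % BASE
        s2 = (s2 + s1) % BASE
    return (s2 << 16) | s1

def adler32_delayed(data: bytes, nmax: int, seed: int = 1) -> int:
    # Delayed modulo in simulated 32-bit registers: reduce only every
    # `nmax` bytes. Correct only while s2 cannot wrap past 2**32.
    s1, s2 = seed & 0xFFFF, (seed >> 16) & 0xFFFF
    for i, b in enumerate(data, 1):
        s1 = (s1 + b) & 0xFFFFFFFF
        s2 = (s2 + s1) & 0xFFFFFFFF
        if i % nmax == 0:
            s1 %= BASE
            s2 %= BASE
    return ((s2 % BASE) << 16) | (s1 % BASE)

# Maxed starting state (s1 = s2 = BASE - 1), then 5553 bytes of 0xFF.
seed = ((BASE - 1) << 16) | (BASE - 1)
data = b"\xff" * 5553

assert adler32_delayed(data, nmax=5552, seed=seed) == adler32_ref(data, seed)
assert adler32_delayed(data, nmax=5553, seed=seed) != adler32_ref(data, seed)
print("boundary confirmed")
```

The `nmax=5552` run still matches because its worst-case intermediate stays just under 2**32; at `nmax=5553` the s2 accumulator wraps exactly once, shifting the result by 2**32 mod 65521 = 225.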