[release/8.0] Fix OOM in BigInteger OuterLoop tests causing SIGKILL on Linux #126011
Tagging subscribers to this area: @dotnet/area-system-numerics |
…lue/10 Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/3875817a-ab5d-4e0e-a906-e8d6e2162043
a69e62a to
058392a
Compare
|
@tannergooding I believe if you are OK with us taking this we need you to add the "Servicing approved" label (per the test only change backport flow) |
|
This is in 8.0 but it continues to cause noise as we continue to run tests there. |
Pull request overview
This PR adjusts System.Runtime.Numerics.Tests BigInteger OuterLoop tests on release/8.0 to avoid extremely large BigInteger left-shifts that were triggering Linux OOM/SIGKILL in constrained/parallel test environments.
Changes:
- Reduce the “very large” shift size in `LargeValueLogTests` from `int.MaxValue / 10` to `1 << 24` and update the expected-log calculation accordingly.
- Reduce the “very large” shift size in `DoubleExplicitCastFromLargeBigIntegerTests` from `int.MaxValue / 10` to `1 << 24`, preserving the infinity assertions while avoiding huge allocations.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/libraries/System.Runtime.Numerics/tests/BigInteger/log.cs | Lowers per-iteration BigInteger growth in large Log tests to prevent OOM while keeping assertions meaningful. |
| src/libraries/System.Runtime.Numerics/tests/BigInteger/cast_from.cs | Lowers per-iteration BigInteger growth in large double cast tests to prevent OOM while still producing ±Infinity. |
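The claim that the reduced shift still produces ±Infinity can be checked with a little integer arithmetic. The following is a minimal Python sketch of that reasoning, not the C# test code; the helper name `overflows_double` is hypothetical and exists only for illustration.

```python
import sys

NEW_SHIFT = 1 << 24  # shift size (in bits) used after the fix

# Largest finite double as an exact integer (just below 2**1024).
DOUBLE_MAX_AS_INT = int(sys.float_info.max)


def overflows_double(shift_bits: int) -> bool:
    """Return True if 2**shift_bits is larger than the largest finite double."""
    return (1 << shift_bits) > DOUBLE_MAX_AS_INT


print(overflows_double(1023))       # 2**1023 still fits in a finite double
print(overflows_double(1024))       # 2**1024 is the first power of two that does not
print(overflows_double(NEW_SHIFT))  # the reduced test shift is far past that point
```

Python's own `float()` conversion raises `OverflowError` for such values instead of returning infinity, so the sketch compares exact integers rather than performing the cast.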
src/libraries/System.Runtime.Numerics/tests/BigInteger/cast_from.cs
|
@copilot address feedback |
…or bit shift Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/763c2d49-b5a3-4483-ae1e-484fe34e0b4f
Done in cabb434. I've addressed all four review items:
|
|
@copilot revert back to the first commit. I commented on the wrong one -- for backports we don't want changes to the code. |
…ommit Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d2473d7c-9ab4-431e-bd7d-948a40220e50
Remove the 4 comment-only changes, keeping only the 3 code lines that actually fix the OOM (int.MaxValue / 10 -> 1 << 24). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Minimizing the change further. |
tannergooding
left a comment
No issues with a test fix. Don't think I'm the right person to approve its merge into net8 though.
|
According to https://github.com/dotnet/runtime/blob/main/docs/project/library-servicing.md#approval-process as area owner you need to apply the servicing approved label |
|
Looks like it also needs sign-off from @jeffhandley (engineering lead) first. |
|
/ba-g ComInterfaceGenerator timed out. |
main PR
Description
`System.Runtime.Numerics.Tests` outerloop tests were crashing with exit code 137 (SIGKILL) on Linux because `LargeValueLogTests` and `DoubleExplicitCastFromLargeBigIntegerTests` shifted `BigInteger` by `int.MaxValue / 10` bits (~214M bits) per iteration in nested loops (up to 4×3 = 12 times), creating values of up to ~107MB. With 2 parallel test threads in a Docker container, this reliably triggered the OOM killer.

On `main`, PR #102874 added `BigInteger.MaxLength`, which causes these shifts to throw `OverflowException` instead of allocating. `release/8.0` has no such cap, so allocations succeed and exhaust container memory.

Changes:
- `log.cs` / `LargeValueLogTests`: Replace `int.MaxValue / 10` with `1 << 24` in both the shift operation and the expected log value calculation. Test correctness is preserved: the values are still well above the `BigInteger.Log` precision threshold.
- `cast_from.cs` / `DoubleExplicitCastFromLargeBigIntegerTests`: Replace `int.MaxValue / 10` with `1 << 24`. `2^(1<<24)` still far exceeds `double.MaxValue ≈ 2^1024`, so infinity assertions remain valid.

Peak per-iteration allocation drops from ~107MB to ~8MB, eliminating the OOM in constrained container environments.
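For reviewers who want to sanity-check the ~107MB and ~8MB figures, here is a rough Python sketch of the size arithmetic. It assumes the shifts accumulate on the same value across 4 inner-loop iterations (inferred from the "4×3" above, not taken from the test source), and it counts only the raw magnitude bits, ignoring any per-object overhead. The names `peak_megabytes` and `INNER_ITERATIONS` are hypothetical.

```python
# Rough size arithmetic for the shifts described above (not the C# test code).
INT_MAX = 2**31 - 1          # value of C# int.MaxValue
OLD_SHIFT = INT_MAX // 10    # bits added per shift before the fix
NEW_SHIFT = 1 << 24          # bits added per shift after the fix
INNER_ITERATIONS = 4         # assumption: 4 cumulative shifts in the inner loop


def peak_megabytes(shift_bits: int, iterations: int = INNER_ITERATIONS) -> float:
    """Approximate magnitude size in MB after `iterations` cumulative shifts."""
    total_bits = shift_bits * iterations
    return total_bits / 8 / 1_000_000


old_peak = peak_megabytes(OLD_SHIFT)
new_peak = peak_megabytes(NEW_SHIFT)
print(f"old peak: {old_peak:.1f} MB, new peak: {new_peak:.1f} MB")
```

Under these assumptions a single old shift adds roughly 27MB, and the ~107MB figure corresponds to the value after the fourth cumulative shift.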
Customer Impact
Persistent 100% failure rate of `System.Runtime.Numerics.Tests` outerloop on all four Linux legs (x64, arm64, musl-x64, musl-arm64) in every scheduled `release/8.0` build. Known issue hit count: 56 times in one month.
Regression
No — this is a long-standing test design issue exposed by container memory limits, not a regression introduced in a recent release.
Testing
Test-only changes. The fix reduces allocation sizes while keeping the test assertions semantically correct. The library itself is not modified.
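To illustrate why the log assertions stay meaningful with the smaller shift, here is a hedged Python sketch. It relies only on the identity log2(2^n) = n, so the expected value grows by the shift size with each cumulative shift. The loop shape (4 cumulative shifts starting from 1) is an assumption for illustration, not a copy of the C# test.

```python
import math

NEW_SHIFT = 1 << 24  # shift size (in bits) used after the fix


def expected_log2(shift_bits: int, shifts: int) -> float:
    """log2 of 1 shifted left `shifts` times by `shift_bits` bits."""
    return float(shift_bits * shifts)


results = []  # (actual, expected) pairs, one per cumulative shift
value = 1
for n in range(1, 5):  # assumption: 4 cumulative shifts
    value <<= NEW_SHIFT
    results.append((math.log2(value), expected_log2(NEW_SHIFT, n)))

for actual, expected in results:
    print(actual, expected)
```

Python's `math.log2` accepts arbitrarily large integers, which makes it convenient for this kind of cross-check; the largest value built here is about 8MB.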
Risk
Very low. Changes are confined to test helper functions; no product code is touched. The assertions remain correct with smaller (but still large) values.
Package authoring signed off?
IMPORTANT: If this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.
This is a test-only change; no NuGet package authoring is required.