Fix OOM aborts in large-memory ASAN tests on GitHub runners #3263

Merged
enjoy-binbin merged 5 commits into valkey-io:unstable from rainsupreme:asan-large-memory
Mar 12, 2026
Conversation

@rainsupreme

Contributor

Carries on from where #3161 left off. The test-sanitizer-address-large-memory jobs were being OOM-killed on GitHub-hosted runners (15.6GB RAM) due to ASAN's 2-3x memory overhead.

Changes:

  • Skip 4GB quicklist compression test under ASAN (requires ~16-24GB with dual buffers + ASAN overhead)
  • Reduce integration test sizes from 5GB to 4.1GB (preserves >4GB 32-bit boundary coverage)
  • Reduce XADD iterations from 10 to 3
  • Add memory monitoring to track minimum free memory during CI runs
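
The ~16-24GB figure for the quicklist test follows from simple multiplication. A rough sketch, assuming the test keeps two 4GB buffers live at once and that ASAN multiplies memory use by 2-3x as stated above; the function name is hypothetical, and baseline process memory is ignored:

```python
# Rough estimate only: ignores baseline process memory and allocator overhead.
RUNNER_RAM_GB = 15.6  # GitHub-hosted runner RAM, as reported in this PR


def asan_peak_estimate_gb(payload_gb, buffers, overhead_factor):
    """Estimate peak memory as payload size x live buffers x ASAN overhead."""
    return payload_gb * buffers * overhead_factor


low = asan_peak_estimate_gb(4, buffers=2, overhead_factor=2)
high = asan_peak_estimate_gb(4, buffers=2, overhead_factor=3)
print(f"estimated peak: {low}-{high} GB, runner has {RUNNER_RAM_GB} GB")
print("low estimate fits on runner:", low <= RUNNER_RAM_GB)
```

Even the low end of the estimate exceeds the runner's RAM, which is consistent with skipping the test under ASAN rather than shrinking it.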

Results:

Test run: https://github.com/rainsupreme/valkey/actions/runs/22474045103/job/65097138093
Before: ~19GB peak → OOM killed (exceeded 15.6GB)
After: ~12GB peak → 3.6GB free at minimum

@rainsupreme

rainsupreme commented Feb 27, 2026

Contributor Author

a somewhat more extensive run (still queued as I type): https://github.com/rainsupreme/valkey/actions/runs/22476202835

(Will retry when the unstable branch daily workflow is fixed)

Comment thread src/unit/test_quicklist.cpp Outdated

@sarthakaggarwal97 sarthakaggarwal97 left a comment

Contributor

Will wait for the latest successful run after gtest fixes, but LGTM! Thanks @rainsupreme

@rainsupreme

Contributor Author

Rebased and tested again: https://github.com/rainsupreme/valkey/actions/runs/22528917037/job/65265240022

GCC version says

=== Memory Summary ===
Total RAM: 15.6GB
Minimum free memory: 2.5GB

Clang version says

=== Memory Summary ===
Total RAM: 15.6GB
Minimum free memory: 2.3GB
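
For readers curious how a summary like this can be produced, here is a minimal sketch. It is not the script from this PR (which is not shown in the thread). It assumes the monitor samples Linux `/proc/meminfo`, treats `MemAvailable` as "free", and divides by 1024² while printing "GB" to mirror the output above. `parse_meminfo` and `MinFreeTracker` are hypothetical names, and the sample values are synthetic.

```python
import re

KIB_PER_GIB = 1024 * 1024


def parse_meminfo(text):
    """Return {field: KiB} for the 'Name:   12345 kB' lines of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


class MinFreeTracker:
    """Remember the lowest MemAvailable value seen across samples."""

    def __init__(self):
        self.total_kib = None
        self.min_free_kib = None

    def sample(self, meminfo_text):
        info = parse_meminfo(meminfo_text)
        self.total_kib = info["MemTotal"]
        free = info["MemAvailable"]
        if self.min_free_kib is None or free < self.min_free_kib:
            self.min_free_kib = free

    def summary(self):
        return (
            "=== Memory Summary ===\n"
            f"Total RAM: {self.total_kib / KIB_PER_GIB:.1f}GB\n"
            f"Minimum free memory: {self.min_free_kib / KIB_PER_GIB:.1f}GB"
        )


# Synthetic samples standing in for periodic reads of /proc/meminfo.
tracker = MinFreeTracker()
for available_kib in (9_000_000, 2_411_725, 6_000_000):
    tracker.sample(
        f"MemTotal:       16357785 kB\nMemAvailable:   {available_kib} kB\n"
    )
print(tracker.summary())
```

In a real job the synthetic strings would be replaced by the contents of `/proc/meminfo`, read on a timer in the background while the tests run.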

@zuiderkwast zuiderkwast left a comment

Contributor

Looks good, though I think 4GiB > 4.1GB
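
The unit question can be checked with integer arithmetic. A small sketch: the thread does not show whether the tests express 4.1GB in decimal or binary units, so both readings are computed here, and the variable names are illustrative.

```python
GIB = 2**30  # binary gigabyte (gibibyte)
GB = 10**9   # decimal gigabyte

boundary_32bit = 2**32          # 4 GiB: first size that no longer fits in 32 bits
size_decimal = 41 * GB // 10    # 4.1 GB read as a decimal size
size_binary = 41 * GIB // 10    # 4.1 GB read as 4.1 GiB

print("32-bit boundary:", boundary_32bit)
print("4.1 decimal GB :", size_decimal, "crosses boundary:", size_decimal > boundary_32bit)
print("4.1 GiB        :", size_binary, "crosses boundary:", size_binary > boundary_32bit)
```

Only the binary reading lands above the 32-bit boundary, so the claim that 4.1GB still exercises the boundary holds only if the test sizes are computed in binary units.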

Comment thread tests/unit/violations.tcl Outdated
Comment thread tests/unit/violations.tcl Outdated
Signed-off-by: Rain Valentine <rsg000@gmail.com>
… runners

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rsg000@gmail.com>
@codecov

codecov Bot commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.90%. Comparing base (8c397ea) to head (2872a9a).
⚠️ Report is 22 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3263      +/-   ##
============================================
- Coverage     74.98%   74.90%   -0.09%     
============================================
  Files           129      129              
  Lines         71551    71551              
============================================
- Hits          53654    53593      -61     
- Misses        17897    17958      +61     

see 20 files with indirect coverage changes

@Nikhil-Manglore Nikhil-Manglore left a comment

Member

Nice, thanks for fixing this!

@zuiderkwast zuiderkwast added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Mar 6, 2026
@enjoy-binbin enjoy-binbin merged commit c9ce3e0 into valkey-io:unstable Mar 12, 2026
100 checks passed
JimB123 pushed a commit that referenced this pull request Mar 19, 2026
Carries on from where #3161 left off. The test-sanitizer-address-large-memory
jobs were being OOM-killed on GitHub-hosted runners (15.6GB RAM) due to
ASAN's 2-3x memory overhead.

Changes:
- Skip 4GB quicklist compression test under ASAN (requires ~16-24GB with
dual buffers + ASAN overhead)
- Reduce integration test sizes from 5GB to 4.1GB (preserves >4GB 32-bit
boundary coverage)
- Reduce XADD iterations from 10 to 3
- Add memory monitoring to track minimum free memory during CI runs

Signed-off-by: Rain Valentine <rsg000@gmail.com>
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 16, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml (weekly uses unstable's workflow), test_quicklist.cpp
(large memory quicklist test doesn't exist on 8.1).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 16, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml, test_quicklist.cpp (not applicable to 8.0).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 27, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml (weekly uses unstable's workflow), test_quicklist.cpp
(large memory quicklist test doesn't exist on 8.1).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 60c86e1)
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 27, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml, test_quicklist.cpp (not applicable to 8.0).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 068362a)
sarthakaggarwal97 pushed a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 27, 2026
…o#3263)

Carries on from where valkey-io#3161 left off. The test-sanitizer-address-large-memory
jobs were being OOM-killed on GitHub-hosted runners (15.6GB RAM) due to
ASAN's 2-3x memory overhead.

Changes:
- Skip 4GB quicklist compression test under ASAN (requires ~16-24GB with
dual buffers + ASAN overhead)
- Reduce integration test sizes from 5GB to 4.1GB (preserves >4GB 32-bit
boundary coverage)
- Reduce XADD iterations from 10 to 3
- Add memory monitoring to track minimum free memory during CI runs

Signed-off-by: Rain Valentine <rsg000@gmail.com>
(cherry picked from commit c9ce3e0)
sarthakaggarwal97 pushed a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 28, 2026
…o#3263)

Carries on from where valkey-io#3161 left off. The test-sanitizer-address-large-memory
jobs were being OOM-killed on GitHub-hosted runners (15.6GB RAM) due to
ASAN's 2-3x memory overhead.

Changes:
- Skip 4GB quicklist compression test under ASAN (requires ~16-24GB with
dual buffers + ASAN overhead)
- Reduce integration test sizes from 5GB to 4.1GB (preserves >4GB 32-bit
boundary coverage)
- Reduce XADD iterations from 10 to 3
- Add memory monitoring to track minimum free memory during CI runs

Signed-off-by: Rain Valentine <rsg000@gmail.com>
(cherry picked from commit c9ce3e0)
sarthakaggarwal97 added a commit to sarthakaggarwal97/valkey that referenced this pull request May 3, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml, test_quicklist.cpp (not applicable to 8.0).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 068362a)
madolson pushed a commit that referenced this pull request May 6, 2026
Partial cherry-pick of c9ce3e0 from unstable.
Applied: list.tcl, set.tcl, violations.tcl (reduce test sizes for CI runners).
Skipped: daily.yml, test_quicklist.cpp (not applicable to 8.0).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 068362a)
Labels

run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

6 participants