Reduce io-threads modifiability test iterations under Valgrind #3980

Merged
sarthakaggarwal97 merged 3 commits into valkey-io:unstable from sarthakaggarwal97:deflake-io-threads-valgrind-timeout on Jun 24, 2026

@sarthakaggarwal97

Contributor

The `test io-threads are runtime modifiable` test in `tests/unit/other.tcl` times out on the dedicated Valgrind jobs of the daily CI, failing the run. The failing test was introduced in #3938.

This PR reduces the loop to 10 iterations under Valgrind.

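Failure links:
- https://github.com/valkey-io/valkey/actions/runs/27386948127 (Jun 12)
- https://github.com/valkey-io/valkey/actions/runs/27315974006 (Jun 11)
- https://github.com/valkey-io/valkey/actions/runs/27245311034 (Jun 10)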

The 'test io-threads are runtime modifiable' test in unit/other toggles
io-threads 100 times. Each toggle tears down and respawns real pthreads
(pthread_create on grow, pthread_cancel + pthread_join on shrink) plus a
drainIOThreadsQueue() pass, which is far slower under Valgrind.
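
To make the cost of one toggle concrete, here is a rough Python sketch of a worker pool that is resized at runtime. It is not Valkey's C implementation: `ResizablePool` and `resize` are hypothetical names, and because Python has no equivalent of `pthread_cancel`, the shrink path uses a cooperative stop event followed by a join instead. The `jobs.join()` call only stands in for the queue-drain pass described above.

```python
import queue
import threading


class ResizablePool:
    """Hypothetical worker pool that can grow or shrink at runtime."""

    def __init__(self):
        self.jobs = queue.Queue()
        self.workers = []  # list of (thread, stop_event) pairs

    def _run(self, stop):
        # Poll for jobs until this worker is asked to stop.
        while not stop.is_set():
            try:
                job = self.jobs.get(timeout=0.01)
            except queue.Empty:
                continue
            job()
            self.jobs.task_done()

    def resize(self, n):
        # Let queued work finish before changing the thread count
        # (a stand-in for the queue-drain pass, not the real one).
        if self.workers:
            self.jobs.join()
        while len(self.workers) < n:  # grow: spawn real threads
            stop = threading.Event()
            t = threading.Thread(target=self._run, args=(stop,), daemon=True)
            t.start()
            self.workers.append((t, stop))
        while len(self.workers) > n:  # shrink: signal the worker, then join it
            t, stop = self.workers.pop()
            stop.set()
            t.join()
```

Every call to `resize` that changes the count pays for thread creation or a blocking join, which is the kind of work that an instrumenting tool such as Valgrind slows down considerably.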

With the IO-threads redesign in place, the full run no longer fits in the
test timeout on the dedicated Valgrind jobs. The same timeout appeared
while the redesign was originally on unstable, went away when it was
reverted, and returned once it was relanded.

Reduce the loop to 10 iterations under Valgrind while keeping the full
100 in normal runs, preserving memcheck coverage of the thread
spawn/teardown path.
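
The idea of a Valgrind-aware loop count can be sketched as follows. This is an illustration in Python, not the Tcl change made in `tests/unit/other.tcl`; the function names, the `set_io_threads` callback, and the thread counts 1 and 4 are all assumptions chosen for the example. Only the 10 and 100 iteration counts come from the description above.

```python
def toggle_iterations(under_valgrind, full=100, reduced=10):
    """Pick the loop count: fewer toggles when running under Valgrind."""
    return reduced if under_valgrind else full


def run_toggle_test(set_io_threads, under_valgrind):
    """Toggle the io-threads setting back and forth (hypothetical driver).

    set_io_threads is a caller-supplied callback that applies a new
    thread count; the values 1 and 4 are placeholders for illustration.
    """
    iterations = toggle_iterations(under_valgrind)
    for _ in range(iterations):
        set_io_threads(1)
        set_io_threads(4)
    return iterations
```

Keeping a nonzero count under Valgrind means the spawn/teardown path is still exercised by memcheck, just fewer times.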

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
@coderabbitai

coderabbitai Bot commented Jun 12, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b609820c-537d-4f4a-bec0-88f4e22e05c3

📥 Commits

Reviewing files that changed from the base of the PR and between 62b7ff3 and d593da9.

📒 Files selected for processing (2)
  • tests/support/server.tcl
  • tests/test_helper.tcl

📝 Walkthrough

Test server spawn notifications are restructured to pass log file paths, enabling test_helper to track and dump server logs on timeout for diagnostics. Additionally, the io-threads runtime modifiability test uses Valgrind-aware iteration count to reduce execution time under Valgrind.

Changes

Server Log Diagnostics on Timeout

| Layer / File(s) | Summary |
|---|---|
| **Structured server-spawned payload and handler update**: `tests/support/server.tcl`, `tests/test_helper.tcl` | `spawn_server` sends `server-spawned` as a list containing PID, stdout log path, and curfile instead of a single formatted string. The test_helper message handler parses this list to extract and record the log path. |
| **Log path tracking and timeout diagnostics**: `tests/test_helper.tcl` | New global array `::server_logs` maps server PIDs to log file paths. New `dump_server_logs` procedure iterates active servers and prints the last 100 lines of each log for diagnostics. Timeout handler calls `dump_server_logs` before terminating servers. |

IO-threads Test Performance Optimization

| Layer / File(s) | Summary |
|---|---|
| **Valgrind-aware iteration count for io-threads test**: `tests/unit/other.tcl` | The io-threads runtime modifiability test now uses a conditional iteration count: 10 iterations under Valgrind, 100 otherwise, replacing the hardcoded loop count. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • enjoy-binbin
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately reflects the main change: reducing test iterations under Valgrind to address timeout issues. |
| Description check | ✅ Passed | The description clearly explains the problem (test timeout under Valgrind), the solution (reducing iterations to 10), and provides evidence with three CI failure links. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@sarthakaggarwal97 added the `run-extra-tests` label (Run extra tests on this PR; runs all tests from daily except valgrind and RESP) on Jun 12, 2026
@codecov

codecov Bot commented Jun 12, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.91%. Comparing base (436dcae) to head (bb20d24).

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3980      +/-   ##
============================================
+ Coverage     76.75%   76.91%   +0.15%     
============================================
  Files           162      162              
  Lines         81017    81017              
============================================
+ Hits          62187    62311     +124     
+ Misses        18830    18706     -124     

see 25 files with indirect coverage changes


@zuiderkwast left a comment

Contributor

Nice! Did you run daily on your branch to check that it fixes it?

Comment thread tests/unit/other.tcl
When a test times out, the harness kills the servers without ever
printing their logs, which makes timeouts hard to troubleshoot. The
orchestrator tracks only server PIDs, not their log paths.

Include the server's stdout log path in the server-spawned message and
track it per-pid in the orchestrator. On timeout, dump the tail of each
still-running server's log before the servers are killed.
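
The bookkeeping described in this commit message can be sketched roughly as below. This is a Python illustration, not the Tcl code in `tests/test_helper.tcl`: the names `server_logs` and `dump_server_logs` mirror the ones mentioned in the review summary, while `on_server_spawned`, `tail`, the return-a-list design, and the header format are assumptions made for the example.

```python
from collections import deque

# Maps server PID -> path of that server's stdout log.
server_logs = {}


def on_server_spawned(payload):
    """Record the log path from a (pid, log_path, curfile) message."""
    pid, log_path, _curfile = payload
    server_logs[pid] = log_path


def tail(path, n=100):
    """Return the last n lines of a file without keeping the whole file."""
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


def dump_server_logs(active_pids, n=100):
    """Collect the log tail of every still-running server for diagnostics."""
    out = []
    for pid in active_pids:
        path = server_logs.get(pid)
        if path is None:
            continue  # no log path was recorded for this server
        out.append(f"=== server {pid}: last {n} lines of {path} ===")
        try:
            out.extend(tail(path, n))
        except OSError as exc:
            out.append(f"(could not read log: {exc})")
    return out
```

In a timeout handler, the collected lines would be printed before the servers are killed, so the log tail reflects each server's state at the moment the test stalled.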

Addresses review feedback on valkey-io#3980.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
@sarthakaggarwal97 moved this to "To be backported" in Valkey 9.1 on Jun 24, 2026
@sarthakaggarwal97 merged commit 79bca53 into valkey-io:unstable on Jun 24, 2026
76 of 77 checks passed
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 25, 2026
The `test io-threads are runtime modifiable` test in
`tests/unit/other.tcl` times out on the dedicated Valgrind jobs of the
daily CI, failing the run. The failing test was introduced in #3938.

This PR reduces the loop to 10 iterations under Valgrind.

**Failure links:**
- https://github.com/valkey-io/valkey/actions/runs/27386948127 (Jun 12)
- https://github.com/valkey-io/valkey/actions/runs/27315974006 (Jun 11)
- https://github.com/valkey-io/valkey/actions/runs/27245311034 (Jun 10)

---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Labels

run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

Status: To be backported

4 participants