Fix potential TCP deadlock in module datatype defrag test#15274
Merged
Conversation
sundb
reviewed
Jun 1, 2026
Comment on lines
+155
to
+169
| # Use batching to avoid TCP deadlock when deferred replies accumulate. | ||
| set batch_size 1000 | ||
| for {set i 0} {$i < $n} {incr i} { | ||
| $rd datatype.set k$i 1 $dummy | ||
|
|
||
| if {($i + 1) % $batch_size == 0} { | ||
| for {set j 0} {$j < $batch_size} {incr j} { | ||
| $rd read | ||
| } | ||
| } | ||
| } | ||
| set remaining [expr {$n % $batch_size}] | ||
| for {set i 0} {$i < $remaining} {incr i} { | ||
| $rd read | ||
| } |
Collaborator
There was a problem hiding this comment.
i'm thinking maybe we can add a general proc for this
# Send `count` commands on a deferring client `rd`, reading replies in batches
# of `batch_size` to avoid a TCP deadlock when deferred replies accumulate.
# `cmd_script` is evaluated in the caller's scope once per iteration with the
# variable `i` (0-based index) available, and must send exactly one command.
proc deferred_batch {rd count cmd_script {batch_size 1000}} {
set pending 0
for {set idx 0} {$idx < $count} {incr idx} {
uplevel 1 [list set i $idx]
uplevel 1 $cmd_script
incr pending
if {$pending == $batch_size} {
for {set i 0} {$i < $pending} {incr i} { $rd read }
set pending 0
}
}
for {set i 0} {$i < $pending} {incr i} { $rd read }
}
Contributor
Author
|
New commit CI: https://github.com/vitahlin/redis/actions/runs/26738251458 |
sundb
approved these changes
Jun 2, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue
The module datatype defrag test sends 20k commands through a deferred client before reading any replies. On slower CI environments this can cause replies to accumulate and fill TCP/socket buffers, leading to flaky
I/O error reading replyfailures.Failed daily CI: https://github.com/redis/redis/actions/runs/26483280016/job/77985080308
Change
Fix by batching deferred writes and reply drains, following the same approach used in #14886.
Test
Daily CI with command:
--loops 200 --single unit/moduleapi/datatype: https://github.com/vitahlin/redis/actions/runs/26511634526/job/78077483092Note
Low Risk
Test-only TCL changes; no production server or module behavior is modified.
Overview
Adds a shared
deferred_batchhelper intests/support/util.tclthat pipelines commands on a deferring Redis client and drains replies every 1000 commands (configurable), so large batches do not fill TCP buffers and deadlock.The module datatype active-defrag test now uses this helper for its ~20k
datatype.setwrites and ~10kdelpasses instead of sending all commands before reading any replies—addressing flakyI/O error reading replyfailures on slow CI, in line with the approach from #14886.Reviewed by Cursor Bugbot for commit f2b9dcd. Bugbot is set up for automated code reviews on this repo. Configure here.