Skip to content

Fix potential TCP deadlock in module datatype defrag test#15274

Merged
sundb merged 2 commits into
redis:unstablefrom
vitahlin:fx-datatype
Jun 2, 2026
Merged

Fix potential TCP deadlock in module datatype defrag test#15274
sundb merged 2 commits into
redis:unstablefrom
vitahlin:fx-datatype

Conversation

@vitahlin

@vitahlin vitahlin commented May 27, 2026

Copy link
Copy Markdown
Contributor

Issue

The module datatype defrag test sends 20k commands through a deferred client before reading any replies. On slower CI environments this can cause replies to accumulate and fill TCP/socket buffers, leading to flaky I/O error reading reply failures.

Failed daily CI: https://github.com/redis/redis/actions/runs/26483280016/job/77985080308

Change

Fix by batching deferred writes and reply drains, following the same approach used in #14886.

Test

Daily CI with command: --loops 200 --single unit/moduleapi/datatype: https://github.com/vitahlin/redis/actions/runs/26511634526/job/78077483092


Note

Low Risk
Test-only TCL changes; no production server or module behavior is modified.

Overview
Adds a shared deferred_batch helper in tests/support/util.tcl that pipelines commands on a deferring Redis client and drains replies every 1000 commands (configurable), so large batches do not fill TCP buffers and deadlock.

The module datatype active-defrag test now uses this helper for its ~20k datatype.set writes and ~10k del passes instead of sending all commands before reading any replies—addressing flaky I/O error reading reply failures on slow CI, in line with the approach from #14886.

Reviewed by Cursor Bugbot for commit f2b9dcd. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread tests/unit/moduleapi/datatype.tcl Outdated
Comment on lines +155 to +169
# Use batching to avoid TCP deadlock when deferred replies accumulate.
set batch_size 1000
for {set i 0} {$i < $n} {incr i} {
$rd datatype.set k$i 1 $dummy

if {($i + 1) % $batch_size == 0} {
for {set j 0} {$j < $batch_size} {incr j} {
$rd read
}
}
}
set remaining [expr {$n % $batch_size}]
for {set i 0} {$i < $remaining} {incr i} {
$rd read
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm thinking maybe we can add a general proc for this

# Send `count` commands on a deferring client `rd`, reading replies in batches
# of `batch_size` to avoid a TCP deadlock when deferred replies accumulate.
# `cmd_script` is evaluated in the caller's scope once per iteration with the
# variable `i` (0-based index) available, and must send exactly one command.
proc deferred_batch {rd count cmd_script {batch_size 1000}} {
    set pending 0
    for {set idx 0} {$idx < $count} {incr idx} {
        uplevel 1 [list set i $idx]
        uplevel 1 $cmd_script
        incr pending
        if {$pending == $batch_size} {
            for {set i 0} {$i < $pending} {incr i} { $rd read }
            set pending 0
        }
    }
    for {set i 0} {$i < $pending} {incr i} { $rd read }
}

@vitahlin

vitahlin commented Jun 1, 2026

Copy link
Copy Markdown
Contributor Author

@sundb sundb merged commit 148de48 into redis:unstable Jun 2, 2026
18 checks passed
@vitahlin vitahlin deleted the fx-datatype branch June 2, 2026 04:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants