Skip to content

Fix buffered_reply assert in HFE commands with module keyspace notifications#3743

Merged
ranshid merged 3 commits into
valkey-io:unstablefrom
enjoy-binbin:fix_notify
May 18, 2026
Merged

Fix buffered_reply assert in HFE commands with module keyspace notifications#3743
ranshid merged 3 commits into
valkey-io:unstablefrom
enjoy-binbin:fix_notify

Conversation

@enjoy-binbin

Copy link
Copy Markdown
Member

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand, hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

…cations

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand, hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in valkey-io#1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin enjoy-binbin requested a review from ranshid May 18, 2026 05:40
@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 945fd333-5507-4730-9a75-79a1e3da786e

📥 Commits

Reviewing files that changed from the base of the PR and between efef7f4 and dab0478.

📒 Files selected for processing (1)
  • tests/unit/moduleapi/block_keyspace_notification.tcl

📝 Walkthrough

Walkthrough

Three hash commands (HGETDEL, HEXPIRE, HPERSIST) now defer their array/per-field replies using the deferred-reply buffer; a new unit test verifies behavior with blocking keyspace notifications.

Changes

Hash command deferred-reply mechanism

Layer / File(s) Summary
Deferred-reply wrapping in HGETDEL, HEXPIRE, and HPERSIST
src/t_hash.c
hgetdelCommand, hexpireGenericCommand, and hpersistCommand each call initDeferredReplyBuffer(c) before composing their array/per-field replies and commitDeferredReplyBuffer(c, 1) after processing.
Blocking keyspace notification test for HFE commands
tests/unit/moduleapi/block_keyspace_notification.tcl
New test exercises HEXPIRE, HGETDEL, and HPERSIST, asserting return values, key existence after HGETDEL, emitted keyspace events, and uses wait_for_blocked_clients_count checks.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main fix: buffered_reply assert issues in HFE commands when modules subscribe to keyspace notifications.
Description check ✅ Passed The description is directly related to the changeset, explaining the bug (buffered_reply assert triggered by module keyspace notifications) and which commands are affected.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread tests/unit/moduleapi/block_keyspace_notification.tcl

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/unit/moduleapi/block_keyspace_notification.tcl (1)

123-144: ⚡ Quick win

Assert notification side effects explicitly in this scenario.

This test mostly validates command return values, but it doesn’t directly assert that the expected keyspace notifications were emitted for the HFE operations under blocking notify conditions. Please add explicit b_keyspace.events assertions (or equivalent) for the exercised commands so the test guards the intended regression more directly.

As per coding guidelines, “Use clear assertions with meaningful error messages in tests.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/moduleapi/block_keyspace_notification.tcl` around lines 123 - 144,
The test exercises HFE commands but doesn't assert emitted keyspace
notifications; after the operations that should generate notifications (r
hexpire h1 ..., r hgetdel h1 ..., r hpersist h1 ... and any deletes), query the
blocker helper (b_keyspace.events or equivalent) and assert the expected event
entries exist and match the commands (e.g., EXPIRE/EXPIRED, HGETDEL/HDEL,
PERSIST) using clear assert_equal/assert_* calls with meaningful messages;
update the test block "HFE commands with blocking keyspace notify" to add these
b_keyspace.events assertions immediately after each command sequence
(referencing functions wait_for_blocked_clients_count, r b_keyspace.clear, r
hset, r hexpire, r hgetdel, r hpersist, and r del h1) so the test validates
notification side effects explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/unit/moduleapi/block_keyspace_notification.tcl`:
- Around line 123-144: The test exercises HFE commands but doesn't assert
emitted keyspace notifications; after the operations that should generate
notifications (r hexpire h1 ..., r hgetdel h1 ..., r hpersist h1 ... and any
deletes), query the blocker helper (b_keyspace.events or equivalent) and assert
the expected event entries exist and match the commands (e.g., EXPIRE/EXPIRED,
HGETDEL/HDEL, PERSIST) using clear assert_equal/assert_* calls with meaningful
messages; update the test block "HFE commands with blocking keyspace notify" to
add these b_keyspace.events assertions immediately after each command sequence
(referencing functions wait_for_blocked_clients_count, r b_keyspace.clear, r
hset, r hexpire, r hgetdel, r hpersist, and r del h1) so the test validates
notification side effects explicitly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 38e11b5e-5b9b-4e7d-826a-c1580f19a812

📥 Commits

Reviewing files that changed from the base of the PR and between 5a604b3 and 5a4b93f.

📒 Files selected for processing (2)
  • src/t_hash.c
  • tests/unit/moduleapi/block_keyspace_notification.tcl

Signed-off-by: Binbin <binloveplay1314@qq.com>
@codecov

codecov Bot commented May 18, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.92%. Comparing base (ad3b33e) to head (dab0478).
⚠️ Report is 2 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3743      +/-   ##
============================================
+ Coverage     76.70%   76.92%   +0.22%     
============================================
  Files           162      162              
  Lines         80674    80695      +21     
============================================
+ Hits          61880    62074     +194     
+ Misses        18794    18621     -173     
Files with missing lines Coverage Δ
src/t_hash.c 95.57% <100.00%> (+0.09%) ⬆️

... and 23 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@enjoy-binbin enjoy-binbin added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label May 18, 2026

@ranshid ranshid left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - 1 small comment about the test, but overall looks good.

Comment thread tests/unit/moduleapi/block_keyspace_notification.tcl
Signed-off-by: Binbin <binloveplay1314@qq.com>
@ranshid ranshid merged commit bfeb368 into valkey-io:unstable May 18, 2026
74 of 75 checks passed
@github-project-automation github-project-automation Bot moved this to To be backported in Valkey 9.0 May 18, 2026
@github-project-automation github-project-automation Bot moved this to To be backported in Valkey 9.1 May 18, 2026
valkeyrie-ops Bot pushed a commit that referenced this pull request May 18, 2026
…cations (#3743)

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand,
hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in
notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin enjoy-binbin deleted the fix_notify branch May 19, 2026 02:38
@github-project-automation github-project-automation Bot moved this to To be backported in Valkey 9.1 May 19, 2026
@github-project-automation github-project-automation Bot moved this to To be backported in Valkey 9.0 May 19, 2026
enjoy-binbin pushed a commit that referenced this pull request May 19, 2026
…3766)

The HFE-blocking-keyspace-notify test added in #3743 is racy and
intermittently fails with:

```
Expected '{event del key h1} {event hdel key h1}' to be equal to '{event hdel key h1}'
(context: ... cmd {assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]} proc ::test)
```

## Why it races

When `HGETDEL` empties the hash, `hgetdelCommand` fires two
notifications back-to-back on the main thread:

1. `notifyKeyspaceEvent(NOTIFY_HASH, "hdel", ...)`
2. `notifyKeyspaceEvent(NOTIFY_GENERIC, "del", ...)` (the key was
removed)

The `block_keyspace_notification` test module's callback always spawns a
background thread that sleeps 1s, appends to the event log, and
optionally unblocks the client. For the second notification,
`RM_BlockClient` returns `NULL` (the client is already blocked by the
first), so the second thread never unblocks anyone — it only appends
`"del"` to the log.

The first thread is the one that unblocks the client, so `HGETDEL`
returns as soon as that thread has logged `"hdel"`. The test client then
immediately reads `b_keyspace.events`, racing the second thread which
has not yet acquired the event-log mutex. On a loaded host or under
valgrind the second thread routinely loses the race, leaving only
`"hdel"` visible.

## Fix

`wait_for_condition` until both events have been logged before asserting
on their contents. This matches the pattern already used by the four
other event-log assertions in this file (e.g. the `Server-created
keyspace notification` and `Event that fires twice` blocks).

```tcl
wait_for_condition 50 100 {
    [llength [r b_keyspace.events]] >= 2
} else {
    fail "Did not see both hdel and del events: [r b_keyspace.events]"
}
assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]
```

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request May 20, 2026
…cations (#3743)

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand,
hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in
notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request May 20, 2026
…3766)

The HFE-blocking-keyspace-notify test added in #3743 is racy and
intermittently fails with:

```
Expected '{event del key h1} {event hdel key h1}' to be equal to '{event hdel key h1}'
(context: ... cmd {assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]} proc ::test)
```

## Why it races

When `HGETDEL` empties the hash, `hgetdelCommand` fires two
notifications back-to-back on the main thread:

1. `notifyKeyspaceEvent(NOTIFY_HASH, "hdel", ...)`
2. `notifyKeyspaceEvent(NOTIFY_GENERIC, "del", ...)` (the key was
removed)

The `block_keyspace_notification` test module's callback always spawns a
background thread that sleeps 1s, appends to the event log, and
optionally unblocks the client. For the second notification,
`RM_BlockClient` returns `NULL` (the client is already blocked by the
first), so the second thread never unblocks anyone — it only appends
`"del"` to the log.

The first thread is the one that unblocks the client, so `HGETDEL`
returns as soon as that thread has logged `"hdel"`. The test client then
immediately reads `b_keyspace.events`, racing the second thread which
has not yet acquired the event-log mutex. On a loaded host or under
valgrind the second thread routinely loses the race, leaving only
`"hdel"` visible.

## Fix

`wait_for_condition` until both events have been logged before asserting
on their contents. This matches the pattern already used by the four
other event-log assertions in this file (e.g. the `Server-created
keyspace notification` and `Event that fires twice` blocks).

```tcl
wait_for_condition 50 100 {
    [llength [r b_keyspace.events]] >= 2
} else {
    fail "Did not see both hdel and del events: [r b_keyspace.events]"
}
assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]
```

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request May 29, 2026
…cations (#3743)

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand,
hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in
notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request May 29, 2026
…3766)

The HFE-blocking-keyspace-notify test added in #3743 is racy and
intermittently fails with:

```
Expected '{event del key h1} {event hdel key h1}' to be equal to '{event hdel key h1}'
(context: ... cmd {assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]} proc ::test)
```

## Why it races

When `HGETDEL` empties the hash, `hgetdelCommand` fires two
notifications back-to-back on the main thread:

1. `notifyKeyspaceEvent(NOTIFY_HASH, "hdel", ...)`
2. `notifyKeyspaceEvent(NOTIFY_GENERIC, "del", ...)` (the key was
removed)

The `block_keyspace_notification` test module's callback always spawns a
background thread that sleeps 1s, appends to the event log, and
optionally unblocks the client. For the second notification,
`RM_BlockClient` returns `NULL` (the client is already blocked by the
first), so the second thread never unblocks anyone — it only appends
`"del"` to the log.

The first thread is the one that unblocks the client, so `HGETDEL`
returns as soon as that thread has logged `"hdel"`. The test client then
immediately reads `b_keyspace.events`, racing the second thread which
has not yet acquired the event-log mutex. On a loaded host or under
valgrind the second thread routinely loses the race, leaving only
`"hdel"` visible.

## Fix

`wait_for_condition` until both events have been logged before asserting
on their contents. This matches the pattern already used by the four
other event-log assertions in this file (e.g. the `Server-created
keyspace notification` and `Event that fires twice` blocks).

```tcl
wait_for_condition 50 100 {
    [llength [r b_keyspace.events]] >= 2
} else {
    fail "Did not see both hdel and del events: [r b_keyspace.events]"
}
assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]
```

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 10, 2026
…cations (#3743)

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand,
hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in
notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 10, 2026
…3766)

The HFE-blocking-keyspace-notify test added in #3743 is racy and
intermittently fails with:

```
Expected '{event del key h1} {event hdel key h1}' to be equal to '{event hdel key h1}'
(context: ... cmd {assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]} proc ::test)
```

## Why it races

When `HGETDEL` empties the hash, `hgetdelCommand` fires two
notifications back-to-back on the main thread:

1. `notifyKeyspaceEvent(NOTIFY_HASH, "hdel", ...)`
2. `notifyKeyspaceEvent(NOTIFY_GENERIC, "del", ...)` (the key was
removed)

The `block_keyspace_notification` test module's callback always spawns a
background thread that sleeps 1s, appends to the event log, and
optionally unblocks the client. For the second notification,
`RM_BlockClient` returns `NULL` (the client is already blocked by the
first), so the second thread never unblocks anyone — it only appends
`"del"` to the log.

The first thread is the one that unblocks the client, so `HGETDEL`
returns as soon as that thread has logged `"hdel"`. The test client then
immediately reads `b_keyspace.events`, racing the second thread which
has not yet acquired the event-log mutex. On a loaded host or under
valgrind the second thread routinely loses the race, leaving only
`"hdel"` visible.

## Fix

`wait_for_condition` until both events have been logged before asserting
on their contents. This matches the pattern already used by the four
other event-log assertions in this file (e.g. the `Server-created
keyspace notification` and `Event that fires twice` blocks).

```tcl
wait_for_condition 50 100 {
    [llength [r b_keyspace.events]] >= 2
} else {
    fail "Did not see both hdel and del events: [r b_keyspace.events]"
}
assert_equal [lsort {{event hdel key h1} {event del key h1}}] [lsort [r b_keyspace.events]]
```

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
ranshid added a commit that referenced this pull request Jun 11, 2026
# Backport sweep for 9.1

Automated cherry-picks from PRs marked "To be backported".

## Applied

| Source PR | Title | Detail |
|---|---|---|
| #3743 | Fix buffered_reply assert in HFE commands with module keyspace
notifications | cherry-picked in a prior sweep |
| #3766 | Fix flaky block_keyspace_notification test for HGETDEL notify
race | cherry-picked in a prior sweep |
| #3800 | Fix heap-use-after-free in ACL LOAD when client free is
deferred | cherry-picked in a prior sweep |
| #3723 | Fix double-finish and RESP reply violation in cluster slot
migration | cherry-picked in a prior sweep |
| #3872 | Redacting customer information when hide_user_data_from_log is
true in rdb.c, networking.c, debug.c and t_hash | cherry-picked in a
prior sweep |
| #3846 | Fix use-after-free in VM_RegisterClusterMessageReceiver |
cherry-picked in a prior sweep |
| #3806 | Add ALL_DBS flag to CLUSTER FLUSHSLOT for database-level ACL |
cherry-picked in a prior sweep |
| #3847 | Harden SENTINEL commands and config rewrite against
control-character injection | |
| #3801 | Validate every DB clause in COPY against ACL db= permissions |
|
| #3851 | Replace AUTOMATION_PAT with Valkeyrie Bot GitHub App token | |
| #3848 | Fix cluster AUX-field control-character and delimiter
injection | |
| #3544 | Revert "IO-Threads redesign cleanup work (#3544)" |
cherry-picked in a prior sweep |
| #3888 | Report exact dbid for COPY in ACL LOG when db= access is
denied | conflicts resolved by Claude Code |
| #3942 | Fix shard_id format specifier in UPDATE message log |  |
| #3941 | Avoid random() % 0 undefined behaviour when
cluster-node-timeout < 30 | |

---
*Generated by valkey-ci-agent using Claude Code.*

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: chx9 <lovelypiska@outlook.com>
Signed-off-by: zackcam <zackcam@amazon.com>
Signed-off-by: Eran Ifrah <eifrah@amazon.com>
Signed-off-by: Jules Lasarte <lasartej@amazon.com>
Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>
Signed-off-by: akash kumar <akumdev@amazon.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: lovelypiska <lovelypiska@outlook.com>
Co-authored-by: zackcam <zackcam@amazon.com>
Co-authored-by: eifrah-aws <eifrah@amazon.com>
Co-authored-by: jjuleslasarte <jules.lasarte@gmail.com>
Co-authored-by: Jules Lasarte <lasartej@amazon.com>
Co-authored-by: Akash Kumar <45854686+akashkgit@users.noreply.github.com>
@sarthakaggarwal97 sarthakaggarwal97 moved this from To be backported to Done in Valkey 9.1 Jun 15, 2026
valkeyrie-ops Bot pushed a commit that referenced this pull request Jun 17, 2026
…cations (#3743)

When a module subscribes to NOTIFY_HASH keyspace events and blocks the
client in the notification callback, hexpireGenericCommand,
hgetdelCommand,
and hpersistCommand would trigger debugServerAssert in
notifyKeyspaceEvent()
because addReplyArrayLen() sets buffered_reply=1 before the notification
fires. This debug assert was added in #1819.

Affected commands: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETDEL,
HPERSIST.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

Status: To be backported
Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants