Fix bug causing no response flush sometimes when IO threads are busy #3205

Merged
murphyjacob4 merged 1 commit into valkey-io:unstable from murphyjacob4:unstable
Feb 14, 2026

@murphyjacob4

Contributor

When we attempt to process commands from IO thread reads a second time, we never call handleClientsWithPendingWrites after that point. This can produce the following sequence, which ends with a response that is never flushed:

  1. After handleClientsWithPendingWrites, we call processIOThreadsReadDone again
  2. Within processIOThreadsReadDone, we write some response back to the client
  3. The response cannot be handed off to the IO thread (e.g. the IO thread job queue is full)
  4. !!! We never call handleClientsWithPendingWrites to process that write on the main thread or install the write handler !!!

Adding a second call to handleClientsWithPendingWrites should fix it. The call should be cheap, since it returns immediately when there are no pending writes.
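
The sequence above can be sketched with a toy model. This is not the Valkey source: the `ToyClient` struct, its fields, and `toyIterationStrandsReply` are hypothetical names introduced for illustration, and the two function bodies are simplified stand-ins that only borrow their names from the description. The ordering of calls follows the numbered steps, not the real event loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these structs and bodies are hypothetical and are not
 * Valkey source. Only the two function names come from the PR description. */
typedef struct {
    bool read_ready;      /* an IO thread finished a read for this client */
    bool pending_write;   /* a reply is buffered but no write is scheduled */
    bool write_scheduled; /* reply offloaded, or handled on the main thread */
    bool io_queue_full;   /* models "IO thread job queue full" */
} ToyClient;

/* Runs the commands an IO thread has read. The reply is offloaded to the IO
 * thread when its queue has room; otherwise it stays pending. */
static void processIOThreadsReadDone(ToyClient *c) {
    if (!c->read_ready) return;
    c->read_ready = false;
    if (c->io_queue_full) {
        c->pending_write = true;
    } else {
        c->write_scheduled = true;
    }
}

/* Fast-returns when nothing is pending; otherwise the main thread takes over
 * the write (standing in for "process the write or install the handler"). */
static void handleClientsWithPendingWrites(ToyClient *c) {
    if (!c->pending_write) return;
    c->pending_write = false;
    c->write_scheduled = true;
}

/* One simulated loop iteration. Returns true when a reply is left stranded. */
static bool toyIterationStrandsReply(ToyClient *c, bool with_fix) {
    assert(!c->pending_write);         /* the model starts from a flushed client */
    processIOThreadsReadDone(c);       /* first pass: nothing is ready yet */
    handleClientsWithPendingWrites(c); /* step 1: the usual flush */
    c->read_ready = true;              /* an IO thread completes a read now */
    processIOThreadsReadDone(c);       /* steps 2-3: reply written, offload may fail */
    if (with_fix) handleClientsWithPendingWrites(c); /* the added second call */
    return c->pending_write && !c->write_scheduled;
}
```

In this model, a client with `io_queue_full` set is left with a stranded reply when `with_fix` is false and is not when `with_fix` is true. When the queue has room, the reply is offloaded either way, which matches the observation that the bug only shows up when the IO threads are busy.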

Fixes #3198

Signed-off-by: Jacob Murphy <jkmurphy@google.com>
@murphyjacob4

Contributor Author

On another note, I'm not sure this "second read" gives much better performance. This path is very unlikely to be triggered in our tests, and it feels like a time bomb given the low path coverage. Maybe we should consider removing it. But let's fix it first.

@murphyjacob4

Contributor Author

I think we need to backport down to 8.0, although I had trouble reproducing it there: without the pipeline optimization, it doesn't seem as likely to trigger.

https://github.com/valkey-io/valkey/blob/8.0/src/server.c#L1719-L1722

@murphyjacob4 murphyjacob4 added the bug Something isn't working label Feb 14, 2026
@murphyjacob4 murphyjacob4 moved this to To be backported in Valkey 8.0 Feb 14, 2026
@murphyjacob4 murphyjacob4 moved this to To be backported in Valkey 8.1 Feb 14, 2026
@murphyjacob4 murphyjacob4 moved this to To be backported in Valkey 9.0 Feb 14, 2026

@enjoy-binbin enjoy-binbin left a comment

Member

It looks like a clever trick.

@murphyjacob4 murphyjacob4 merged commit 6268698 into valkey-io:unstable Feb 14, 2026
58 checks passed
@codecov

codecov Bot commented Feb 14, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (fd57c21) to head (09894a5).
⚠️ Report is 1 commits behind head on unstable.


@roshkhatri roshkhatri moved this from To be backported to 8.1.6 WIP in Valkey 8.1 Feb 17, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 17, 2026
@roshkhatri roshkhatri moved this from To be backported to 9.0.3 in Valkey 9.0 Feb 17, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 17, 2026
@roshkhatri roshkhatri moved this from To be backported to 8.0.7 (WIP) in Valkey 8.0 Feb 18, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 18, 2026
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 20, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 20, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 20, 2026
madolson pushed a commit that referenced this pull request Feb 24, 2026
madolson pushed a commit that referenced this pull request Feb 24, 2026
madolson pushed a commit that referenced this pull request Feb 24, 2026
hpatro pushed a commit to hpatro/valkey that referenced this pull request Mar 5, 2026
lmagomes pushed a commit to lmagomes/home-services that referenced this pull request May 12, 2026
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [docker.io/valkey/valkey](https://github.com/valkey-io/valkey) | image | patch | `9.0.1` → `9.0.4` |

---

### Release Notes

<details>
<summary>valkey-io/valkey (docker.io/valkey/valkey)</summary>

### [`v9.0.4`](https://github.com/valkey-io/valkey/releases/tag/9.0.4)

[Compare Source](valkey-io/valkey@9.0.3...9.0.4)

Upgrade urgency SECURITY: This release includes security fixes we recommend you
apply as soon as possible.

##### Security fixes

- (CVE-2026-23479) Use-After-Free in unblock client flow
- (CVE-2026-25243) Invalid Memory Access in RESTORE command
- (CVE-2026-23631) Use-after-free when full sync occurs during a yielding Lua/function execution

### [`v9.0.3`](https://github.com/valkey-io/valkey/releases/tag/9.0.3)

[Compare Source](valkey-io/valkey@9.0.2...9.0.3)

##### Valkey 9.0.3

Upgrade urgency SECURITY: This release includes security fixes we recommend you
apply as soon as possible.

##### Security fixes

- (CVE-2025-67733) RESP Protocol Injection via Lua error\_reply
- (CVE-2026-21863) Remote DoS with malformed Valkey Cluster bus message
- (CVE-2026-27623) Reset request type after handling empty requests

##### Bug fixes

- Avoids crash during MODULE UNLOAD when ACL rules reference a module command and subcommand ([#3160](valkey-io/valkey#3160))
- Fix server assert on ACL LOAD when current user loses permission to channels ([#3182](valkey-io/valkey#3182))
- Fix bug causing no response flush sometimes when IO threads are busy ([#3205](valkey-io/valkey#3205))

### [`v9.0.2`](https://github.com/valkey-io/valkey/releases/tag/9.0.2)

[Compare Source](valkey-io/valkey@9.0.1...9.0.2)

Upgrade urgency HIGH: There are critical bugs that may affect a subset of users.

#### Bug fixes

- Avoid memory leak of new argv when HEXPIRE commands target only non-existing fields ([#2973](valkey-io/valkey#2973))
- Fix HINCRBY and HINCRBYFLOAT to update volatile key tracking ([#2974](valkey-io/valkey#2974))
- Avoid empty hash object when HSETEX added no fields ([#2998](valkey-io/valkey#2998))
- Fix case-sensitive check for the FNX and FXX arguments in HSETEX ([#3000](valkey-io/valkey#3000))
- Prevent assertion in active expiration job after a hash with volatile fields is overwritten ([#3003](valkey-io/valkey#3003), [#3007](valkey-io/valkey#3007))
- Fix HRANDFIELD to return null response when no field could be found ([#3022](valkey-io/valkey#3022))
- Fix HEXPIRE to not delete items when validation rules fail and expiration is in the past ([#3023](valkey-io/valkey#3023), [#3048](valkey-io/valkey#3048))
- Fix how hashes handle overwriting of expired fields ([#3060](valkey-io/valkey#3060))
- HSETEX - Always issue keyspace notifications after validation ([#3001](valkey-io/valkey#3001))
- Make zero a valid TTL for hash fields during import mode and data loading ([#3006](valkey-io/valkey#3006))
- Trigger prepareCommand on argc change in module command filters ([#2945](valkey-io/valkey#2945))
- Restrict TTL from being negative and avoid crash in import-mode ([#2944](valkey-io/valkey#2944))
- Fix chained replica crash when doing dual channel replication ([#2983](valkey-io/valkey#2983))
- Skip slot cache optimization for AOF client to prevent key duplication and data corruption ([#3004](valkey-io/valkey#3004))
- Fix used\_memory\_dataset underflow due to miscalculated used\_memory\_overhead ([#3005](valkey-io/valkey#3005))
- Avoid duplicate calculations of network-bytes-out in slot stats with copy-avoidance ([#3046](valkey-io/valkey#3046))
- Fix XREAD returning error on empty stream with + ID ([#2742](valkey-io/valkey#2742))

#### Performance/Efficiency Improvements

- Track reply bytes in I/O threads if commandlog-reply-larger-than is -1 ([#3086](valkey-io/valkey#3086), [#3126](valkey-io/valkey#3126)).
  This makes it possible to mitigate a performance regression in 9.0.1 caused by the bug fix [#2652](valkey-io/valkey#2652).

**Full Changelog**: <valkey-io/valkey@9.0.1...9.0.2>

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - "before 6am"
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).

Labels

bug Something isn't working

Projects

Status: 8.0.7 (WIP)
Status: 8.1.6
Status: 9.0.3

Development

Successfully merging this pull request may close these issues.

[BUG] IO Threads: Write-heavy pipelines sometimes hang on response flush (Data is written, Client hangs)

3 participants