Skip to content

fix(ext/fetch): retry on stale pooled HTTP/1.1 connections#32566

Merged
bartlomieju merged 3 commits intodenoland:mainfrom
bartlomieju:fix/fetch-retry-stale-connection
Mar 11, 2026
Merged

fix(ext/fetch): retry on stale pooled HTTP/1.1 connections#32566
bartlomieju merged 3 commits intodenoland:mainfrom
bartlomieju:fix/fetch-retry-stale-connection

Conversation

@bartlomieju
Copy link
Copy Markdown
Member

@bartlomieju bartlomieju commented Mar 7, 2026

Summary

Fixes #31955

When a server shuts down and restarts on the same port, the fetch client's
connection pool may still hold a stale keep-alive connection. Reusing it causes
connection closed before message completed errors on every other request.

The existing is_error_retryable() only handled HTTP/2 GOAWAY and
REFUSED_STREAM. This PR extends it to also retry on:

  • hyper::Error::is_incomplete_message() — connection closed before the
    message completed (the main error from the issue)
  • io::ErrorKind::ConnectionReset — TCP RST from a stale pooled connection

Both are safe to retry because the server never received/processed the request.
The retry is limited to one attempt (enforced by the existing Retried marker
extension) and only applies to requests with clonable bodies (not streaming).

Note: despite being reported as a regression in 2.6.6, this issue reproduces all
the way back to Deno 1.46.0 — it was a long-standing bug, not a regression.

bartlomieju and others added 2 commits March 7, 2026 20:40
When a server shuts down and restarts on the same port, the fetch
client's connection pool may still hold a stale keep-alive connection.
Reusing it causes "connection closed before message completed" errors.

Extend `is_error_retryable()` to also retry on:
- `hyper::Error::is_incomplete_message()` (connection closed mid-request)
- `io::ErrorKind::ConnectionReset` (TCP RST on stale connection)

These are safe to retry because the server never received the request.

Closes denoland#31955

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix clippy collapsible_if lint errors
- Handle ConnectionAborted (Windows os error 10053) in addition to
  ConnectionReset for cross-platform retry support
- Fix errorMessageIncludesUrlAndDetailsWithTcpInfo unit test to accept
  connections in a loop so retries hit the same error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@kajukitli kajukitli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, good fix for a frustrating issue

the retry logic is sound:

  • is_incomplete_message() catches the "connection closed before message completed" case
  • ConnectionReset / ConnectionAborted covers TCP RST from stale connections
  • both are safe to retry since the server never processed the request

the existing Retried marker ensures we don't retry infinitely, and the body-clonability check prevents retrying streaming requests.

nice that you also fixed the unit test to accept connections in a loop — otherwise the retry would hit a closed listener and fail differently.

one thought: might be worth adding a brief comment in the test explaining why we loop on accept (so future readers understand it's for retry coverage, not just cleanup).

@bartlomieju bartlomieju merged commit e0464f4 into denoland:main Mar 11, 2026
112 checks passed
@bartlomieju bartlomieju deleted the fix/fetch-retry-stale-connection branch March 11, 2026 23:02
Copy link
Copy Markdown
Contributor

@kajukitli kajukitli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the changes with full repo context. No issues found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression in deno 2.6.6: connection closed before message completed

3 participants