Handle mid-session backend crashes gracefully #4053

Merged
yrobla merged 2 commits into main from issue-3875 on Mar 11, 2026

Conversation

@yrobla yrobla commented Mar 9, 2026

Summary

Why: When a backend crashed mid-session, errors propagated with no indication of which backend failed, making it impossible for clients to distinguish a single-backend outage from a total session failure. This also violated the issue requirement that error messages identify the backend.

Fixes #3875

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Changes

What changed:

  • lookupBackend now returns *vmcp.BackendTarget alongside the connection.
  • CallTool, ReadResource, and GetPrompt wrap backend errors as backend "<id>" unavailable: <error>, keeping the original error in the chain so callers can still match it with errors.Is.
  • Session resilience was already correct (one backend crash never terminates the session); this change makes the error surface match the spec.

Affected components: pkg/vmcp/session/default_session.go, default_session_test.go

Does this introduce a user-facing change?

No

Special notes for reviewers

@yrobla yrobla requested a review from Copilot March 9, 2026 14:54
@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Mar 9, 2026

Copilot AI left a comment

Pull request overview

This PR updates vMCP session-scoped routing so that when an individual backend fails mid-session, the returned error clearly identifies which backend failed (per issue #3875), while keeping the original cause unwrappable.

Changes:

  • Extended lookupBackend to return the resolved *vmcp.BackendTarget alongside the backend session connection.
  • Wrapped backend errors in CallTool, ReadResource, and GetPrompt with a backend-identifying "backend <id> unavailable: ..." message.
  • Added/updated unit tests to validate backend ID presence and session resilience for tool calls across multiple backends.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Files and descriptions:

  • pkg/vmcp/session/default_session.go: Returns backend target from lookup and wraps backend call errors with backend-identifying messages.
  • pkg/vmcp/session/default_session_test.go: Updates CallTool error assertions and adds multi-backend crash resilience tests for tool calls.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 178fbeaf56

codecov bot commented Mar 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.56%. Comparing base (9031587) to head (d02827d).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4053      +/-   ##
==========================================
+ Coverage   68.48%   68.56%   +0.07%     
==========================================
  Files         446      447       +1     
  Lines       45573    45672      +99     
==========================================
+ Hits        31211    31315     +104     
+ Misses      11948    11942       -6     
- Partials     2414     2415       +1     

@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 9, 2026
jerm-dro previously approved these changes Mar 9, 2026

@jerm-dro jerm-dro left a comment

Some nitpicks that you can address now or in another PR

Why: When a backend crashed mid-session, errors propagated with no indication
of which backend failed, making it impossible for clients to distinguish
a single-backend outage from a total session failure.
This also violated the issue requirement that error messages identify the backend.

Closes: #3875
@yrobla yrobla merged commit 1bf2f3a into main Mar 11, 2026
38 of 39 checks passed
@yrobla yrobla deleted the issue-3875 branch March 11, 2026 08:21
Labels

size/S Small PR: 100-299 lines changed

Development

Successfully merging this pull request may close these issues.

[vMCP] Handle mid-session backend crashes gracefully

5 participants