Increase session management E2E test timeout to 3 minutes #4344

Merged
JAORMX merged 1 commit into main from fix-flaky-session-management-timeout
Mar 24, 2026

@JAORMX JAORMX commented Mar 24, 2026

Summary

The session management E2E tests were the only VirtualMCPServer tests using a 2-minute timeout for WaitForVirtualMCPServerReady — all other tests use 3 minutes. Under CI load (this test runs as spec ~120/121), backend pod startup can be slow enough that the health monitor accumulates 3 failures (at 30-second default intervals) before the backend is reachable. Recovery from unhealthy state requires 2 additional successful checks (another 60 seconds), pushing the total past the 2-minute window and causing a timeout with "All backends are unhealthy".

  • Increase timeout from 2 minutes to 3 minutes, matching all other VirtualMCPServer E2E tests
  • This fixes flaky failures in both session management contexts: "Session token storage and retrieval" and "Session token binding prevents session hijacking"

Type of change

  • Bug fix

Test plan

  • Verified all other VirtualMCPServer E2E tests already use 3-minute timeout
  • Confirmed the session management test was the only test file using 2-minute timeout

Special notes for reviewers

The worst-case health monitor timeline with default settings (30s interval, threshold 3):

  • T=0-60s: 3 failed checks → backend marked unhealthy
  • T=90s: first successful check → status transitions to degraded (still counted as "unhealthy" by countBackendHealth)
  • T=120s: second successful check → status transitions to healthy → VirtualMCPServer ready

With a 2-minute timeout, the test races against a worst case of up to 150 seconds. With 3 minutes, there is comfortable headroom. A deeper fix could configure shorter health check intervals on the test's VirtualMCPServer (as the circuit breaker and external auth tests do with 5s intervals), but aligning the timeout is the minimal safe change.

Generated with Claude Code

The session management tests used a 2-minute timeout for
WaitForVirtualMCPServerReady, while all other VirtualMCPServer E2E
tests use 3 minutes. With the default 30-second health check interval
and 3-failure unhealthy threshold, the health monitor needs up to
150 seconds to detect backend readiness if the first check fails
during pod startup — exceeding the 2-minute window and causing
flaky "All backends are unhealthy" timeouts under CI load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the size/XS Extra small PR: < 100 lines changed label Mar 24, 2026

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.90%. Comparing base (ccb98c3) to head (a985aad).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4344      +/-   ##
==========================================
- Coverage   68.95%   68.90%   -0.06%     
==========================================
  Files         479      479              
  Lines       48489    48489              
==========================================
- Hits        33438    33412      -26     
- Misses      12317    12343      +26     
  Partials     2734     2734              

@JAORMX JAORMX merged commit 14d1120 into main Mar 24, 2026
69 of 71 checks passed
@JAORMX JAORMX deleted the fix-flaky-session-management-timeout branch March 24, 2026 17:05