debugging test flakes #3028

Merged
jhrozek merged 44 commits into main from jerm/debugging-flakes
Dec 16, 2025

@jerm-dro
Contributor

@jerm-dro jerm-dro commented Dec 14, 2025

changes:

  • dump k8s state on any Ginkgo failure. We do this in JustAfterEach so that every failure is intercepted before cleanup, without having to instrument each test manually.
  • don't use example.com for fetch calls. Use a local server to avoid variable latency (see below).
  • when initializing clients, retry with a brand-new client on each attempt to avoid stale sessions (seen here).
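
The retry-with-a-fresh-client idea from the last bullet can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `session` type, `initWithFreshClient`, and `demo` are hypothetical names, and the "stale session" error is simulated. The point is only that the client is constructed inside the loop, so a failed attempt cannot leak state into the next one.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// session is a hypothetical stand-in for an MCP client.
type session struct {
	id int
}

// initWithFreshClient builds a new client before every attempt, so a failed
// initialization never carries session state over to the next try.
func initWithFreshClient(
	attempts int,
	delay time.Duration,
	newClient func() *session,
	initialize func(*session) error,
) (*session, error) {
	lastErr := errors.New("no attempts made")
	for i := 1; i <= attempts; i++ {
		c := newClient() // brand-new client, never reused across attempts
		err := initialize(c)
		if err == nil {
			return c, nil
		}
		lastErr = fmt.Errorf("attempt %d: %w", i, err)
		if i < attempts {
			time.Sleep(delay)
		}
	}
	return nil, lastErr
}

// demo simulates two stale sessions followed by a healthy one. It returns the
// number of clients created and the id of the client that succeeded.
func demo() (int, int, error) {
	created := 0
	newClient := func() *session {
		created++
		return &session{id: created}
	}
	initialize := func(s *session) error {
		if s.id < 3 {
			return errors.New("stale session") // simulated failure
		}
		return nil
	}
	s, err := initWithFreshClient(5, time.Millisecond, newClient, initialize)
	if err != nil {
		return created, 0, err
	}
	return created, s.id, nil
}

func main() {
	created, winner, err := demo()
	fmt.Println(created, winner, err)
}
```

Retrying `initialize` on the same client would keep hitting whatever broken state the first attempt left behind; constructing the client inside the loop trades a little setup cost for independent attempts.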

Example.com failures

Identified a problem with exceptionally long fetch tool calls against example.com. In the log below, the request takes 30 seconds to return:

2025/12/15 17:39:00 Tool call received: fetch
2025/12/15 17:39:00 Fetching URL: https://example.com/
2025/12/15 17:39:30 HTTP 200 response from https://example.com/ (Content-Type: text/html)
2025/12/15 17:39:30 Successfully fetched 513 bytes from https://example.com/
2025/12/15 17:39:30 Fetch completed successfully for https://example.com/, returning 149 characters
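
One way to remove this kind of external latency from a test is to fetch from a server the test controls. The sketch below uses Go's standard `net/http/httptest` package as an in-process stand-in; it is an assumption for illustration and not the server this PR deploys. The helper names `newLocalPage`, `fetch`, and `fetchLocal` are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// newLocalPage starts an in-process HTTP server that always serves body as HTML.
func newLocalPage(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/html")
		fmt.Fprint(w, body)
	}))
}

// fetch retrieves url with a hard client-side timeout, so a slow upstream
// fails quickly instead of stalling the test.
func fetch(url string, timeout time.Duration) (string, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// fetchLocal serves a fixed page locally and fetches it back.
func fetchLocal() (string, error) {
	srv := newLocalPage("<html><body>hello</body></html>")
	defer srv.Close()
	return fetch(srv.URL, 5*time.Second)
}

func main() {
	body, err := fetchLocal()
	fmt.Println(body, err)
}
```

Because the response never leaves the machine, its timing no longer depends on a third-party host, and the explicit client timeout bounds the worst case if something does hang.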

Signed-off-by: Jeremy Drouillard <jeremy@stacklok.com>
@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Dec 14, 2025
@codecov

codecov bot commented Dec 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.50%. Comparing base (45acc7c) to head (6ca34ea).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3028      +/-   ##
==========================================
- Coverage   56.51%   56.50%   -0.02%     
==========================================
  Files         334      334              
  Lines       33170    33170              
==========================================
- Hits        18747    18742       -5     
- Misses      12839    12844       +5     
  Partials     1584     1584              

@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Dec 15, 2025
jhrozek previously approved these changes Dec 15, 2025
Contributor

@jhrozek jhrozek left a comment

Looks good! The changes address the E2E test flakiness well:

  • InitializeMCPClientWithRetries properly creates fresh clients on each retry to avoid stale session state
  • Using an in-cluster mock HTTP server instead of example.com avoids network flakiness in CI
  • The state dump via JustAfterEach will help debug future test failures
  • Good to see the previously skipped tests re-enabled

Left one non-blocking suggestion about OpenShift compatibility.

@jhrozek jhrozek merged commit 3b84eb0 into main Dec 16, 2025
33 checks passed
@jhrozek jhrozek deleted the jerm/debugging-flakes branch December 16, 2025 09:52