fix(gateway): back off session tool mirrors under pressure#84846
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed May 24, 2026, 9:26 PM ET / 01:26 UTC. Summary PR surface: Source +48, Tests +124. Total +172 across 4 files. Reproducibility: no. high-confidence live reproduction was run in this read-only review. The linked issue provides redacted CPU and gateway logs, and current main source shows session tool mirrors are always sent when visible, so the source-level path is clear. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance: Risk before merge
Maintainer options:
Next step before merge Security Review detailsBest possible solution: Land the gateway reliability change only after maintainers accept the session-mirror delivery tradeoff and review redacted after-fix runtime proof for the pressure path. Do we have a high-confidence way to reproduce the issue? No high-confidence live reproduction was run in this read-only review. The linked issue provides redacted CPU and gateway logs, and current main source shows session tool mirrors are always sent when visible, so the source-level path is clear. Is this the best way to solve the issue? Mostly yes, pending maintainer approval. The patch changes the owning diagnostic heartbeat and gateway fanout seams and preserves terminal/run-scoped tool events, but the intentional delivery tradeoff should be accepted with after-fix runtime proof. Codex review notes: model gpt-5.5, reasoning high; reviewed against f37fbc9ef49e. Label changesLabel justifications:
Evidence reviewedPR surface: Source +48, Tests +124. Total +172 across 4 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat. Where did the egg go?
|
80286e6 to
9057559
Compare
|
Verification for 9057559: Local:
CI:
Known proof gap:
|
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
…84846) Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
Summary
Motivation
active=1 queued=3and a fetch timeout drifted by 7175ms. Backing off lower-priority mirror traffic during that condition reduces avoidable websocket work while preserving completion signals.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
node scripts/run-vitest.mjs src/logging/diagnostic.test.ts src/gateway/server-chat.agent-events.test.tsandgit diff --check.end,error,result) are preserved so UI tool cards can complete.Root Cause (if applicable)
Regression Test Plan (if applicable)
src/logging/diagnostic.test.ts,src/gateway/server-chat.agent-events.test.tsUser-visible / Behavior Changes
During diagnostic queue pressure, lower-priority session tool mirror updates may be dropped temporarily; terminal session tool events and run-scoped tool events are still delivered.
Diagram (if applicable)
Security Impact (required)
Yes/No): NoYes/No): NoYes/No): NoYes/No): NoYes/No): NoYes, explain risk + mitigation: N/ARepro + Verification
Environment
Steps
node scripts/run-vitest.mjs src/logging/diagnostic.test.ts src/gateway/server-chat.agent-events.test.ts.git diff --check.Expected
Actual
git diff --checkpassed with no output.Evidence
Attach at least one:
Human Verification (required)
What you personally verified (not just CI), and how:
Review Conversations
Compatibility / Migration
Yes/No): YesYes/No): NoYes/No): NoRisks and Mitigations