fix(gateway): clear stale chat stream buffers#75089
Conversation
|
Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 4:48 AM ET / 08:48 UTC. Summary PR surface: Source +9, Tests +97. Total +106 across 21 files. Reproducibility: yes. for source-level reproduction: current main writes raw assistant stream text into Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Keep the focused cleanup, have a maintainer resolve the draft/hold state and refresh checks on the current head, then land it as a narrow Gateway memory-cleanup fix if the hold is cleared. Do we have a high-confidence way to reproduce the issue? Yes for source-level reproduction: current main writes raw assistant stream text into Is this the best way to solve the issue? Yes for the narrow leak path: centralizing per-run cleanup in AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 00ca654c7405. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +9, Tests +97. Total +106 across 21 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
@clawsweeper please re-review. I addressed the P2 by removing the |
|
@clawsweeper please re-review. Follow-ups from the last review are addressed:
Local checks passed: |
b146377 to
0efb768
Compare
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
merge-pr wrapper script fucked up the PR, hold merge |
5d02ce0 to
05ca0e3
Compare
|
Merged via squash.
Verification:
|
Summary
clearRunpath on normal finish, abort, timeout, and stale maintenance cleanup.bufferUpdatedAtso raw-only streamed buffers that never emitted deltas can be swept after their run is no longer active.Root cause
Gateway
rawBufferscan retain full streamed assistant text. Some cleanup paths removed only projected buffers and delta metadata, and suppressed/raw-only output could avoiddeltaSentAtentirely, so maintenance could miss it.Impact
When long or frequent streamed chat runs hit abort, timeout, suppressed-output, or missing terminal-event paths, the retained raw buffers can accumulate in the gateway process and cause gateway memory to grow over time. This PR ensures those per-run buffers are cleaned consistently so stale chat state does not become a gateway memory leak.
Related Issues
Related to #51821, which reported stale
chatRunState.bufferscleanup gaps for stuck runs.Contributes to the broader Gateway RSS/session accumulation tracker #54155, but does not fully close that tracker.
Review follow-up
mainso the PR no longer relies on the stale Apr 30 base.deltaLastBroadcastText,agentDeltaSentAt, andbufferedAgentEventsinChatRunState.clearRun()and full-state cleanup.toolStartDatasweep from this PR. The final diff stays focused on Gateway chat buffer cleanup and does not delete active tool-start records before their matching end event.CHANGELOG.mdentry for the user-facing Gateway chat buffer memory-growth fix.trusted: falseto clear the OpenGrep explicit-trust finding onserver.impl.ts.Real behavior proof
0efb7685cbac7de522fb6fc604643993dfa9dc4con macOS, using the realstartGatewayMaintenanceTimers()source path.node --import tsx --input-type=module, createdcreateChatRunState(), seeded a stale raw-only run inrawBuffers,bufferUpdatedAt, anddeltaLastBroadcastText, startedstartGatewayMaintenanceTimers({ ...gatewayMaintenanceDeps, chatRunState }), then waited for the real maintenance interval withawait new Promise((resolve) => setTimeout(resolve, 61_000)).{ "label": "before maintenance", "runId": "runtime-proof-75089", "rawBuffers": 1, "bufferUpdatedAt": 1, "buffers": 0, "deltaSentAt": 0, "deltaLastBroadcastLen": 0, "deltaLastBroadcastText": 1, "agentDeltaSentAt": 0, "bufferedAgentEvents": 0 }{ "label": "after maintenance", "runId": "runtime-proof-75089", "rawBuffers": 0, "bufferUpdatedAt": 0, "buffers": 0, "deltaSentAt": 0, "deltaLastBroadcastLen": 0, "deltaLastBroadcastText": 0, "agentDeltaSentAt": 0, "bufferedAgentEvents": 0 }rawBuffers,bufferUpdatedAt, anddeltaLastBroadcastText; projected/delta/agent maps were also empty.Validation
pnpm test src/gateway/server.reload.test.ts src/gateway/server-maintenance.test.ts src/gateway/chat-abort.test.ts(3 files, 32 tests passed)pnpm test src/gateway/server-methods/chat.abort-authorization.test.ts(1 file, 5 tests passed)pnpm test:gateway(299 files, 3781 tests passed)pnpm tsgo:corepnpm tsgo:core:testpnpm lint --threads=8(previous local PR validation)pnpm exec vitest run --config test/vitest/vitest.agents-core.config.ts src/agents/pi-embedded-subscribe.handlers.tools.test.tspnpm exec vitest run --config test/vitest/vitest.agents-core.config.ts src/agents/model-fallback.test.ts -t "does not classify successful results when the optional classifier returns null"(from earlier investigation)Note: full
agents-corehad one flaky timeout insrc/agents/model-fallback.test.ts; the timed-out test passed when rerun directly.