fix(core): prevent auto-memory recall from blocking main request#3814
Conversation
bfeb9ce to
727e24f
Compare
727e24f to
03ce4fa
Compare
| return EMPTY_RELEVANT_AUTO_MEMORY_RESULT; | ||
| } | ||
|
|
||
| const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => { |
There was a problem hiding this comment.
[Suggestion] setTimeout in resolveAutoMemoryWithDeadline is never cleared via clearTimeout when the recall promise wins the race. The dangling timer can block the Node.js event loop from draining during graceful process exit — if process.exit() is called within 2.5s of the race resolving, the process will hang until the timer fires.
| const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => { | |
| async function resolveAutoMemoryWithDeadline( | |
| promise: Promise<RelevantAutoMemoryPromptResult> | undefined, | |
| ): Promise<RelevantAutoMemoryPromptResult> { | |
| if (!promise) { | |
| return EMPTY_RELEVANT_AUTO_MEMORY_RESULT; | |
| } | |
| let timer: ReturnType<typeof setTimeout>; | |
| const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => { | |
| timer = setTimeout( | |
| () => resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT), | |
| 2_500, | |
| ); | |
| }); | |
| try { | |
| return await Promise.race([promise, deadline]); | |
| } finally { | |
| clearTimeout(timer!); | |
| } | |
| } |
— deepseek-v4-pro via Qwen Code /review
| const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => { | ||
| setTimeout(() => resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT), 2_500); | ||
| }); | ||
|
|
There was a problem hiding this comment.
[Suggestion] Promise.race does not cancel the loser. When the 2.5s deadline wins, the recall promise continues executing in the background — including logMemoryRecall telemetry in recall.ts that emits a "successful recall" event for a result that was discarded. This inflates production recall-success metrics: oncall engineers see green dashboards while users actually received no memory context.
Consider passing a unified AbortSignal into the recall() pipeline so that when the deadline fires, the entire recall chain (scan → model call → heuristic fallback → telemetry) can abort early. At minimum, gate the logMemoryRecall call behind a signal check so discarded results are not counted.
— deepseek-v4-pro via Qwen Code /review
| async function resolveAutoMemoryWithDeadline( | ||
| promise: Promise<RelevantAutoMemoryPromptResult> | undefined, | ||
| ): Promise<RelevantAutoMemoryPromptResult> { | ||
| if (!promise) { |
There was a problem hiding this comment.
[Suggestion] The early return path if (!promise) return EMPTY_RELEVANT_AUTO_MEMORY_RESULT (triggered when getManagedAutoMemoryEnabled() is false) has no test coverage. The two new tests cover the deadline-hit and fast-completion paths, but the undefined-promise guard is untested.
Consider adding a test where config.getManagedAutoMemoryEnabled() returns false, then verify sendMessageStream completes normally without auto-memory content injected.
— deepseek-v4-pro via Qwen Code /review
…ll deadline Three changes from the review: 1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is now cleared in a finally block so the dangling timer doesn't block the Node.js event loop from draining during graceful process exit. 2. Telemetry inflation — An AbortController is created before the recall call and aborted when the 2.5s deadline fires. Its signal is passed into the recall pipeline via a new optional abortSignal field on ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls in recall.ts are gated behind !options.abortSignal?.aborted so discarded results don't emit success metrics. 3. Test coverage — Added a test for the !promise guard path (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream completes without calling recall or injecting memory content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the thorough review — all three items have been addressed. `setTimeout` leak in `resolveAutoMemoryWithDeadline` — The dangling timer is now cleared via `clearTimeout(timer)` in a `finally` block, so when the recall promise wins the race, the process can drain the event loop normally during graceful exit. Recall continues in background / telemetry inflation — Added an `AbortController` at the call site that gets aborted when the 2.5s deadline fires. The controller's signal is passed into the `recall()` pipeline via a new optional `abortSignal` field on `ResolveRelevantAutoMemoryPromptOptions`. All three `logMemoryRecall` calls in `recall.ts` are now gated behind `!options.abortSignal?.aborted`, so discarded results don't inflate recall-success metrics. Missing test for `!promise` guard — Added a test (`should proceed without auto-memory when managed auto-memory is disabled`) that mocks `getManagedAutoMemoryEnabled()` returning `false`, then verifies that `recall()` is never called and the main request proceeds without memory content. Validation:
|
| let timer: ReturnType<typeof setTimeout>; | ||
| const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => { | ||
| timer = setTimeout(() => { | ||
| onDeadline(); |
There was a problem hiding this comment.
[Critical] In resolveAutoMemoryWithDeadline, onDeadline() is called before resolve() in the setTimeout callback. If onDeadline throws (e.g., a rare edge case in AbortController.abort()), resolve() never executes. The deadline promise stays pending forever, Promise.race hangs, and sendMessageStream deadlocks at the await.
| onDeadline(); | |
| timer = setTimeout(() => { | |
| try { | |
| onDeadline(); | |
| } finally { | |
| resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT); | |
| } | |
| }, 2_500); |
— deepseek-v4-pro via Qwen Code /review
| private lastPromptId: string | undefined = undefined; | ||
| private lastSentIdeContext: IdeContext | undefined; | ||
| private forceFullIdeContext = true; | ||
| private _pendingRecallAbortController: AbortController | undefined; |
There was a problem hiding this comment.
[Critical] _pendingRecallAbortController is leaked on 4 early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal). The controller is set at line 710 but never aborted or cleared before these returns between lines 748-815. On the next sendMessageStream call, the stale controller is overwritten, and the background recall runs to completion with misleading telemetry.
Also note: the _ prefix is inconsistent with all other private fields in GeminiClient (chat, sessionTurnCount, lastPromptId, etc.). Consider pendingRecallAbortController without the underscore.
| private _pendingRecallAbortController: AbortController | undefined; | |
| // Wrap the recall-initiation-to-consumption span in try/finally: | |
| // At line 710 (after `this._pendingRecallAbortController = recallAbortController`), | |
| // ensure all early returns and exceptions clean up: | |
| try { | |
| // ... existing logic ... | |
| } finally { | |
| this._pendingRecallAbortController?.abort(); | |
| this._pendingRecallAbortController = undefined; | |
| } |
— deepseek-v4-pro via Qwen Code /review
| @@ -131,6 +131,8 @@ export interface ResolveRelevantAutoMemoryPromptOptions { | |||
| excludedFilePaths?: Iterable<string>; | |||
| limit?: number; | |||
| recentTools?: readonly string[]; | |||
There was a problem hiding this comment.
[Critical] The abortSignal added here is passed into recall() but never forwarded to selectRelevantAutoMemoryDocumentsByModel. The model selector creates its own independent AbortSignal.timeout(2_000) in relevanceSelector.ts:90, completely decoupled from this external abort signal. When the 2.5s deadline fires, the model API call keeps running — consuming quota for a discarded result. The runSideQuery in relevanceSelector.ts already accepts an abortSignal parameter; the plumbing exists but isn't connected.
| recentTools?: readonly string[]; | |
| // In recall.ts, pass abortSignal to the model selector: | |
| const selectedDocs = await selectRelevantAutoMemoryDocumentsByModel( | |
| options.config, | |
| query, | |
| docs, | |
| limit, | |
| options.recentTools, | |
| options.abortSignal, // forward the abort signal | |
| ); | |
| // In relevanceSelector.ts, combine with existing timeout: | |
| abortSignal: AbortSignal.any([ | |
| AbortSignal.timeout(2_000), | |
| callerAbortSignal, | |
| ].filter(Boolean)) |
— deepseek-v4-pro via Qwen Code /review
| excludedFilePaths: this.surfacedRelevantAutoMemoryPaths, | ||
| abortSignal: recallAbortController.signal, | ||
| }) | ||
| .catch((error: unknown) => { |
There was a problem hiding this comment.
[Suggestion] The .catch() error handler that swallows errors and returns EMPTY_RELEVANT_AUTO_MEMORY_RESULT has zero test coverage. No test simulates recall() rejection to verify the main request still completes. If this handler regresses, a failed recall becomes an unhandled promise rejection.
| .catch((error: unknown) => { | |
| // Add test in client.test.ts: | |
| it('should proceed normally when recall rejects', async () => { | |
| mockMemoryManager.recall.mockRejectedValue(new Error('recall failed')); | |
| const stream = client.sendMessageStream( | |
| [{ text: 'Quick question' }], | |
| new AbortController().signal, | |
| 'prompt-id-recall-fail', | |
| ); | |
| for await (const _ of stream) { /* consume */ } | |
| expect(mockTurnRunFn).toHaveBeenCalledWith( | |
| 'test-model', | |
| expect.not.arrayContaining([expect.stringContaining('Relevant memory')]), | |
| expect.any(AbortSignal), | |
| ); | |
| }); |
— deepseek-v4-pro via Qwen Code /review
| const relevantAutoMemory = relevantAutoMemoryPromise | ||
| ? await relevantAutoMemoryPromise | ||
| : EMPTY_RELEVANT_AUTO_MEMORY_RESULT; | ||
| const recallAbortController = this._pendingRecallAbortController; |
There was a problem hiding this comment.
[Suggestion] The deadline timer in resolveAutoMemoryWithDeadline starts when the function is called here (consumption time), not when the recall was initiated at line ~692. If the intermediate code (recording, microcompact, compression, token check, IDE context) takes significant time, the effective deadline shrinks, potentially leaving no time for the heuristic fallback. Consider racing the promise with the deadline at initiation time and storing the raced result instead.
| const recallAbortController = this._pendingRecallAbortController; | |
| // At initiation (line ~695): | |
| const deadlineResult = resolveAutoMemoryWithDeadline( | |
| recallPromise, | |
| () => recallAbort.abort(), | |
| ); | |
| // Store deadlineResult and consume it directly here. |
— deepseek-v4-pro via Qwen Code /review
Critical fixes: - Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so resolve() always runs even if onDeadline() throws, preventing deadlock - Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, combined with the existing 2s timeout via AbortSignal.any() so the model API call is cancelled when the 2.5s deadline fires - Abort and clear pendingRecallAbortController on all early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal) - Rename _pendingRecallAbortController → pendingRecallAbortController (inconsistent underscore prefix) Tests added: - client.test.ts: verify sendMessageStream completes normally when recall rejects - relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout, and timeout-only signal when no caller signal provided
|
Thanks for the thorough review — all concerns have been addressed. Deadline timer moved from consumption to initiation time: Previously addressed (from earlier commits):
Validation:
|
wenshao
left a comment
There was a problem hiding this comment.
Additional findings (not tied to specific diff lines)
[Suggestion] resetChat() (line ~269) clears surfacedRelevantAutoMemoryPaths but does not abort/clear pendingRecallAbortController. If resetChat() is called during an in-flight recall, the stale controller persists into the next session. Consider adding this.pendingRecallAbortController?.abort(); this.pendingRecallAbortController = undefined; in resetChat().
[Suggestion] The catch block log in recall.ts (~line 221) says "Model-driven auto-memory recall failed; falling back to heuristic selection" — but when the deadline triggers an abort, the model didn't fail, it was cancelled. The heuristic result is also discarded (deadline already resolved). Consider distinguishing AbortError from real model errors to avoid misleading oncall debugging.
— deepseek-v4-pro via Qwen Code /review
| onDeadline(); | ||
| } finally { | ||
| resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT); | ||
| } |
There was a problem hiding this comment.
[Suggestion] When the 2.5s deadline fires, the system silently proceeds without memory context — no warning log, no telemetry event, no observable signal. Oncall engineers cannot detect recall degradation from logs or monitoring.
Consider adding a debugLogger.warn and a recall_timed_out telemetry event before the deadline resolves with EMPTY_RELEVANT_AUTO_MEMORY_RESULT.
— deepseek-v4-pro via Qwen Code /review
| private lastPromptId: string | undefined = undefined; | ||
| private lastSentIdeContext: IdeContext | undefined; | ||
| private forceFullIdeContext = true; | ||
| private pendingRecallAbortController: AbortController | undefined; |
There was a problem hiding this comment.
[Suggestion] pendingRecallAbortController is stored as an instance field. If sendMessageStream is ever called concurrently (no mutex guard currently), a second call would overwrite the field, causing the first call's early-return cleanup to abort the wrong controller. The onDeadline closure captures the correct controller (asymmetric correctness).
Consider using a local variable within sendMessageStream instead of an instance field, so each invocation has its own controller.
— deepseek-v4-pro via Qwen Code /review
| this.config.getMaxSessionTurns() > 0 && | ||
| this.sessionTurnCount > this.config.getMaxSessionTurns() | ||
| ) { | ||
| this.pendingRecallAbortController?.abort(); |
There was a problem hiding this comment.
[Suggestion] The 4 early-return paths (MaxSessionTurns, boundedTurns === 0, SessionTokenLimitExceeded, Arena control signal) each clean up pendingRecallAbortController with ?.abort() + = undefined, but none of these paths have test coverage verifying the abort actually happens. If the cleanup is accidentally removed or reordered, it would not be caught.
Consider adding tests for each early-return path that assert the abort signal passed to mockMemoryManager.recall is in the aborted state after the return.
— deepseek-v4-pro via Qwen Code /review
| excludedFilePaths?: Iterable<string>; | ||
| limit?: number; | ||
| recentTools?: readonly string[]; | ||
| /** When provided and aborted, suppresses logMemoryRecall telemetry for discarded results. */ |
There was a problem hiding this comment.
[Suggestion] The JSDoc for abortSignal says "suppresses logMemoryRecall telemetry for discarded results" — but the signal's primary role is to propagate the caller's deadline abort to the model-driven selector so the LLM call is actually cancelled. The telemetry suppression is a side effect, not the main purpose.
Update the JSDoc to document both roles: abort propagation (primary) and telemetry suppression (secondary).
— deepseek-v4-pro via Qwen Code /review
| @@ -131,6 +131,8 @@ export interface ResolveRelevantAutoMemoryPromptOptions { | |||
| excludedFilePaths?: Iterable<string>; | |||
| limit?: number; | |||
| recentTools?: readonly string[]; | |||
There was a problem hiding this comment.
[Suggestion] resolveRelevantAutoMemoryPromptForQuery has zero unit tests — packages/core/src/memory/ contains no test file for recall.ts. The new abortSignal parameter's conditional suppression of logMemoryRecall at 3 sites (lines 173, 204, 232) is only exercised indirectly through client.test.ts where mockMemoryManager.recall bypasses the real function.
Consider adding packages/core/src/memory/recall.test.ts with tests that pass an already-aborted AbortSignal and assert logMemoryRecall is not called.
— deepseek-v4-pro via Qwen Code /review
| expect(runSideQuery).not.toHaveBeenCalled(); | ||
| }); | ||
|
|
||
| it('forwards caller abort signal to runSideQuery combined with timeout', async () => { |
There was a problem hiding this comment.
[Suggestion] The two new abort signal tests only assert passedSignal is instanceof AbortSignal, but never verify that when the caller signal is aborted, the combined AbortSignal.any([...]) signal is also in the aborted state. The core signal propagation logic is untested.
Add a test: create an AbortController, call .abort(), pass its signal as callerAbortSignal, then assert passedSignal.aborted === true.
— deepseek-v4-pro via Qwen Code /review
| ); | ||
| }); | ||
|
|
||
| it('should not block the main request when auto-memory recall is slow', async () => { |
There was a problem hiding this comment.
[Suggestion] The slow-recall test verifies that memory content is excluded from the main request (the result), but does not verify that recallAbortController.abort() was actually called (the mechanism). If the onDeadline callback is refactored away, the test still passes while production silently wastes resources.
Consider capturing the AbortSignal passed to mockMemoryManager.recall, advancing timers past the deadline, and asserting capturedSignal.aborted === true.
— deepseek-v4-pro via Qwen Code /review
| contents, | ||
| schema: RESPONSE_SCHEMA, | ||
| abortSignal: AbortSignal.timeout(5_000), | ||
| abortSignal: callerAbortSignal |
There was a problem hiding this comment.
[Suggestion] The model-driven selector timeout was reduced from 5s to 2s, and the 2.5s client deadline is hardcoded in a separate file with no shared constant or invariant check. If someone adjusts the model timeout above 2.5s, the deadline always wins, permanently disabling model-driven selection with no warning.
Consider: (1) keeping the standalone timeout at 5s when no callerAbortSignal is present (only use 2s inside AbortSignal.any); (2) extracting both timeout values to a shared constant or adding a runtime invariant(modelTimeoutMs < deadlineMs) check.
— deepseek-v4-pro via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
Round 3 review
The abort plumbing looks solid now — signal threads cleanly from the deadline through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, the four early-return paths all clear pendingRecallAbortController, and logMemoryRecall is gated on !aborted so discarded results don't pollute success metrics. The try/finally around onDeadline() and the move of the race to initiation time both look right. Net-net, this is in much better shape than round 1.
One thing I'd like to see before approving:
Missing test for early-return abort cleanup
The four pendingRecallAbortController?.abort() cleanups added in 1edd7fdd (MaxSessionTurns / boundedTurns=0 / SessionTokenLimitExceeded / arena cancel) are net-new behavior with zero test coverage. If a future refactor accidentally drops one of these, recall will keep running in the background after the user-facing turn already returned — exactly the silent-degradation class of bug this PR set out to fix. Please add at least one direct test.
Suggested starting point, modeled on the existing should yield MaxSessionTurns and stop when session turn limit is reached test:
it('should abort the pending recall when MaxSessionTurns is hit', async () => {
vi.spyOn(client['config'], 'getMaxSessionTurns').mockReturnValue(1);
client['sessionTurnCount'] = 1; // already at limit; next call exceeds it
const recallAbortHandler = vi.fn();
mockMemoryManager.recall.mockImplementation((_root, _query, opts) => {
opts.abortSignal?.addEventListener('abort', recallAbortHandler);
return new Promise(() => {}); // never resolves
});
const mockStream = (async function* () {
yield { type: 'content', value: 'Hello' };
})();
mockTurnRunFn.mockReturnValue(mockStream);
const mockChat: Partial<GeminiChat> = {
addHistory: vi.fn(),
getHistory: vi.fn().mockReturnValue([]),
};
client['chat'] = mockChat as GeminiChat;
const stream = client.sendMessageStream(
[{ text: 'over the limit' }],
new AbortController().signal,
'prompt-id-over-limit',
);
const events = [];
for await (const event of stream) {
events.push(event);
}
expect(events).toEqual([{ type: GeminiEventType.MaxSessionTurns }]);
expect(recallAbortHandler).toHaveBeenCalledTimes(1);
});Same shape works for SessionTokenLimitExceeded and the arena-control path; boundedTurns=0 is low priority (only triggers when turns=0 is passed in explicitly), so one or two of these is enough.
Non-blocking suggestions (your call, fine as follow-ups)
-
clearTimeout(timer!)non-null assertion is a bit fragile — it relies on the implicit "Promise constructor runs executor synchronously" contract. Safer:let timer: ReturnType<typeof setTimeout> | undefined; ... finally { if (timer !== undefined) clearTimeout(timer); }
-
recall.tsheuristic fallback ignores abort — when the model selector aborts, the catch falls through to the synchronous heuristic, which doesn't checkabortSignal.aborted. The work itself is fast and in-memory so this is harmless, but strictly it shouldn't produce results after the deadline. One-line guard at the top of the catch block. -
Log level for expected aborts —
debugLogger.warn('Managed auto-memory recall prefetch failed.', error)will fire for every deadline-triggered abort, which is the designed path on a slow setup. Consider downgradingAbortErrortodebugso the warn channel stays meaningful:.catch((error) => { if ((error as { name?: string })?.name === 'AbortError') { debugLogger.debug('Auto-memory recall aborted by deadline.'); } else { debugLogger.warn('Managed auto-memory recall prefetch failed.', error); } return EMPTY_RELEVANT_AUTO_MEMORY_RESULT; });
-
PR description — the move from race-at-consumption to race-at-initiation in
857a306cis a deliberate trade-off (consistent recall budget vs. lost overlap with intermediate work). Worth a sentence in the PR description so future readers don't have to dig through commits. -
Follow-up idea, not for this PR — issue #3759 also suggested routing the recall selector to
fastModel. Orthogonal to this PR; with a fast model behind it the deadline could probably drop to ~1s. Worth a follow-up issue if you don't already have one.
Once the early-return abort test (item above) lands I'll approve.
- resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message.
|
@wenshao (via deepseek-v4-pro) — Both suggestions addressed: 1. resetChat() not aborting pendingRecallAbortController — Fixed. Now aborts and clears the controller before calling 2. Misleading catch-block log in recall.ts — Fixed. Now distinguishes
This way an oncall engineer reading logs can tell whether the recall was cancelled by the deadline (normal) vs an actual model failure (investigate). Validation: 128/128 tests pass across 16 test files (client.test.ts + memory/). Build clean. Commit: |
|
Both suggestions from the review have been addressed and pushed (commit 167970e). 128/128 tests pass. |
wenshao
left a comment
There was a problem hiding this comment.
Quick nudge — 167970efe looks good but I think it crossed wires with my round-3 review.
Acknowledged from this commit:
resetChat()now aborts + clearspendingRecallAbortController✓recall.tscatch distinguishesAbortError(debug) from real model errors (warn) ✓ — this is exactly my round-3 item #3, and theerror instanceof DOMException && error.name === 'AbortError'check is actually narrower / better than the.name-only check I suggested
Still outstanding from my round-3 review (the CHANGES_REQUESTED one I posted at 02:04 UTC, separate from the deepseek-v4-pro one you replied to):
🔴 #1 — Missing test for the four pendingRecallAbortController?.abort() early-return paths added in 1edd7fdd (MaxSessionTurns / boundedTurns=0 / SessionTokenLimitExceeded / arena cancel). I posted a copy-pasteable test sketch in that review — modeled on the existing should yield MaxSessionTurns and stop when session turn limit is reached test, uses mockMemoryManager.recall.mockImplementation to register an abort listener, asserts recallAbortHandler fires once. This is the only blocker — my round-3 review explicitly said "Once the early-return abort test lands I'll approve."
The other items in that review (#2 clearTimeout(timer!) non-null, #4 PR description, #5 fastModel follow-up) are all optional / nice-to-have and don't gate merge.
Also FYI: the branch is now mergeable: CONFLICTING — main moved while you were iterating. Will need a rebase before merge.
CI also hasn't run on 167970efe yet (likely held by the conflict state).
- Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com>
…ll deadline Three changes from the review: 1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is now cleared in a finally block so the dangling timer doesn't block the Node.js event loop from draining during graceful process exit. 2. Telemetry inflation — An AbortController is created before the recall call and aborted when the 2.5s deadline fires. Its signal is passed into the recall pipeline via a new optional abortSignal field on ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls in recall.ts are gated behind !options.abortSignal?.aborted so discarded results don't emit success metrics. 3. Test coverage — Added a test for the !promise guard path (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream completes without calling recall or injecting memory content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Critical fixes: - Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so resolve() always runs even if onDeadline() throws, preventing deadlock - Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, combined with the existing 2s timeout via AbortSignal.any() so the model API call is cancelled when the 2.5s deadline fires - Abort and clear pendingRecallAbortController on all early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal) - Rename _pendingRecallAbortController → pendingRecallAbortController (inconsistent underscore prefix) Tests added: - client.test.ts: verify sendMessageStream completes normally when recall rejects - relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout, and timeout-only signal when no caller signal provided
- resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message.
8ed1193 to
60238c7
Compare
- Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com>
wenshao
left a comment
There was a problem hiding this comment.
Round 4 review
The 4 early-return abort tests + arena/heuristic abort plumbing all look right now on 60238c7b1; CI is green across all 11 platforms. No blockers — items below are non-blocking suggestions / latent risks worth filing as follow-ups if not addressed inline.
Non-blocking suggestions
1. relevanceSelector.test.ts abort-forwarding test asserts type, not behavior.
The current assertion expect(passedSignal).toBeInstanceOf(AbortSignal) would still pass if the code regressed to abortSignal: AbortSignal.timeout(2_000) without the AbortSignal.any(...) combine — both branches return an AbortSignal. The contract this test is supposed to lock in ("aborting the caller signal aborts the combined signal forwarded to runSideQuery") has no regression protection. Suggested:
const callerController = new AbortController();
let capturedSignal: AbortSignal | undefined;
vi.mocked(runSideQuery).mockImplementation(async (_config, opts) => {
capturedSignal = opts.abortSignal;
return { selected_memories: [] };
});
await selectRelevantAutoMemoryDocumentsByModel(
mockConfig, 'check preferences', docs, 2, [], callerController.signal,
);
callerController.abort();
expect(capturedSignal!.aborted).toBe(true);2. arena-cancel early-return abort path has no direct test.
Same shape as the new MaxSessionTurns / SessionTokenLimitExceeded tests, just with an arenaAgentClient mock returning a control signal. Three of the four early-return paths added in 1edd7fdd are now covered; the arena one is the odd one out. Skip is fine if you'd rather follow-up, but the round-3 ask was about all four.
3. client.ts:918-920 comment is slightly misleading.
// The recall promise was already raced against the 2.5s deadline at
// initiation time; this await just collects the result.
this.pendingRecallAbortController = undefined;The comment explains why we can await directly, but not why we can clear the controller before the await — a future reader will worry the in-flight recall loses its kill-switch. Suggested addendum: "The deadline timer keeps running independently inside `resolveAutoMemoryWithDeadline`; clearing the field here is safe because no further early-return path can fire after this point."
4. Concurrent `sendMessageStream` calls would race on `pendingRecallAbortController` (latent).
Because the field is single-valued, if a UserQuery is mid-flight and a Notification call enters and hits MaxSessionTurns, the Notification will `?.abort()` the UserQuery's controller — Notification doesn't initialize a new controller (it's gated on `UserQuery || Cron`), so it sees the stale field. Mitigated in practice by the deadline timer + the CLI processing one stream at a time, but the code doesn't enforce this. If you want to harden, switch to `Set` (add on init, delete on consumption / abort). Fine as a follow-up.
5. Magic numbers `2_500` and `2_000`.
These two constants are semantically linked (deadline = model timeout + 0.5s heuristic-fallback budget) but live in different files. A pair of named constants with JSDoc explaining the relationship would help readers — and makes the eventual fast-model retune (#3759 follow-up) a single-line change.
Round-3 items confirmed addressed in this commit (60238c7b1)
- All 4 early-return paths abort + null the controller (3 of 4 now have direct tests)
- `resetChat()` aborts in-flight recall
- `clearTimeout(timer)` uses explicit `undefined` check, no non-null assertion
- `recall.ts` catch distinguishes `AbortError` (debug) from real failures (warn)
- `client.ts` recall `.catch` does the same
- `logMemoryRecall` gated on `!options.abortSignal?.aborted` in all three call sites
- Heuristic fallback skipped when caller abort already set
- Deadline race moved from consumption-time to initiation-time
LGTM modulo the suggestions above — happy to approve once #1 (real abort-propagation assertion) lands. Others can be follow-ups.
QwenLM#3759) The auto-memory recall side-query had a 5s AbortSignal.timeout that fired on every turn, and the main request path awaited the full recall promise (including timeout + heuristic fallback). This caused a ~5s delay on every user turn. Changes: - relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s - client.ts: add resolveAutoMemoryWithDeadline() that races the recall promise against a 2.5s deadline, returning empty result if recall hasn't completed in time - client.test.ts: add 2 tests verifying slow recall doesn't block the main request and fast recall still includes memory content Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stripThoughtsFromHistory was removed from GeminiChat on main. The mock in client.test.ts still referenced it, causing TS2353 build failures in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ll deadline Three changes from the review: 1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is now cleared in a finally block so the dangling timer doesn't block the Node.js event loop from draining during graceful process exit. 2. Telemetry inflation — An AbortController is created before the recall call and aborted when the 2.5s deadline fires. Its signal is passed into the recall pipeline via a new optional abortSignal field on ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls in recall.ts are gated behind !options.abortSignal?.aborted so discarded results don't emit success metrics. 3. Test coverage — Added a test for the !promise guard path (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream completes without calling recall or injecting memory content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Critical fixes: - Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so resolve() always runs even if onDeadline() throws, preventing deadlock - Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, combined with the existing 2s timeout via AbortSignal.any() so the model API call is cancelled when the 2.5s deadline fires - Abort and clear pendingRecallAbortController on all early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal) - Rename _pendingRecallAbortController → pendingRecallAbortController (inconsistent underscore prefix) Tests added: - client.test.ts: verify sendMessageStream completes normally when recall rejects - relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout, and timeout-only signal when no caller signal provided
|
Thanks —
Only thing left: branch is |
- resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message.
- Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com>
60238c7 to
d1fe01d
Compare
|
Commit: Thanks for the thorough round 4 review — the abort-propagation test fix is exactly right. Abort-propagation test now asserts behavior, not type: The test in
This directly tests the ESLint fix: Removed the Rebase: Branch rebased onto latest Root cause of pre-existing test failures: Stale compiled Validation:
|
Moves the resolveAutoMemoryWithDeadline() race from consumption time to initiation time so the 2.5s budget is not consumed by intermediate work (microcompact, compression, token checks, IDE context) between recall initiation and consumption. The raced promise is stored directly in relevantAutoMemoryPromise and simply awaited at consumption. The pendingRecallAbortController cleanup on early return paths is preserved unchanged.
- resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message.
- Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com>
d1fe01d to
14ae8d6
Compare
…r.test.ts The import/no-internal-modules eslint-disable comment was unnecessary — the rule does not flag same-package imports like ../utils/sideQuery.js. ESLint 9 reports it as an unused directive, failing the CI lint check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Local E2E verificationReproduced the bug and confirmed the fix end-to-end against a real Setup
Wall-clock
~2.6s per turn, ~23% faster — matches the expected gap from cutting recall blocking from ~5s to the 2.5s deadline. Debug-log timing (from
|
| Build | init → recall abort warn | Notes |
|---|---|---|
| PR run 1 | 2.118s | 2s selector timeout fires deterministically |
| PR run 2 | 2.119s | same |
| PR run 3 | 2.119s | same |
| main run 1 | no warn | selector returned within 5s; main path still synchronously waited ~3–5s |
| main run 2 | no warn | same |
| main run 3 | no warn | same |
Useful confirmation: the bug isn't strictly "selector hits the 5s timeout every turn." On deepseek-v4-pro the selector often completes in 3–5s, but the main path waits for it regardless. That matches the "delayed by ~5s − prepDuration" framing in #3759, and the PR's choice to race a deadline at initiation (rather than just shrinking the selector timeout) is the right shape of fix.
One observation worth a follow-up (non-blocking)
recall.ts:224 distinguishes AbortError from real model errors with error instanceof DOMException && error.name === 'AbortError'. In the real abort path this branch never fires — the OpenAI client wraps the abort as a plain Error with message "Request was aborted.", so every abort falls through to the debugLogger.warn(...) path on line 230. Functional fix is unaffected (recall still aborts, main path is still unblocked), but the log-differentiation introduced for oncall debugging doesn't actually differentiate in practice. Worth a small follow-up — match on the wrapped message, or detect at a layer where the abort hasn't been re-wrapped.
wenshao
left a comment
There was a problem hiding this comment.
LGTM. Reviewed across 4 rounds, all blocking items addressed; local E2E reproduction confirms the ~2.5s/turn improvement (see comment above).
…nLM#3836) - fix(core): prevent auto-memory recall from blocking main request (QwenLM#3814) - feat(core,cli): surface and cancel auto-memory dream tasks (QwenLM#3836) - fix(sdk-python): standardize TAG_PREFIX to include v suffix (QwenLM#3832) Conflict resolutions: - Footer.tsx: removed stale dreamRunning/configInitMessage (not used in render) - BackgroundTaskViewContext.tsx: kept @hoptrendy/hopcode-core, added createDebugLogger - manager.test.ts: included upstream cancelTask + subscribe filter tests, replaced QWEN_CODE_MEMORY_LOCAL with HOPCODE_MEMORY_LOCAL - get-release-version.js: kept hopcode-sdk brand, adopted upstream v prefix fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(core): prevent auto-memory recall from blocking main request (issue #3759) The auto-memory recall side-query had a 5s AbortSignal.timeout that fired on every turn, and the main request path awaited the full recall promise (including timeout + heuristic fallback). This caused a ~5s delay on every user turn. Changes: - relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s - client.ts: add resolveAutoMemoryWithDeadline() that races the recall promise against a 2.5s deadline, returning empty result if recall hasn't completed in time - client.test.ts: add 2 tests verifying slow recall doesn't block the main request and fast recall still includes memory content Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): remove stripThoughtsFromHistory from GeminiChat mocks stripThoughtsFromHistory was removed from GeminiChat on main. The mock in client.test.ts still referenced it, causing TS2353 build failures in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(core): address PR #3814 review comments on auto-memory recall deadline Three changes from the review: 1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is now cleared in a finally block so the dangling timer doesn't block the Node.js event loop from draining during graceful process exit. 2. Telemetry inflation — An AbortController is created before the recall call and aborted when the 2.5s deadline fires. Its signal is passed into the recall pipeline via a new optional abortSignal field on ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls in recall.ts are gated behind !options.abortSignal?.aborted so discarded results don't emit success metrics. 3. Test coverage — Added a test for the !promise guard path (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream completes without calling recall or injecting memory content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(core): address PR #3814 review findings for auto-memory recall Critical fixes: - Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so resolve() always runs even if onDeadline() throws, preventing deadlock - Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, combined with the existing 2s timeout via AbortSignal.any() so the model API call is cancelled when the 2.5s deadline fires - Abort and clear pendingRecallAbortController on all early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal) - Rename _pendingRecallAbortController → pendingRecallAbortController (inconsistent underscore prefix) Tests added: - client.test.ts: verify sendMessageStream completes normally when recall rejects - relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout, and timeout-only signal when no caller signal provided * fix(core): start auto-memory recall deadline at initiation time Moves the resolveAutoMemoryWithDeadline() race from consumption time to initiation time so the 2.5s budget is not consumed by intermediate work (microcompact, compression, token checks, IDE context) between recall initiation and consumption. The raced promise is stored directly in relevantAutoMemoryPromise and simply awaited at consumption. The pendingRecallAbortController cleanup on early return paths is preserved unchanged. * fix(core): address PR #3814 review feedback - resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message. * fix(core): address PR #3814 round 3 review feedback - Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com> * test(core): assert abort propagation in relevance selector * fix(lint): remove unused eslint-disable directive in relevanceSelector.test.ts The import/no-internal-modules eslint-disable comment was unnecessary — the rule does not flag same-package imports like ../utils/sideQuery.js. ESLint 9 reports it as an unused directive, failing the CI lint check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nLM#3814) * fix(core): prevent auto-memory recall from blocking main request (issue QwenLM#3759) The auto-memory recall side-query had a 5s AbortSignal.timeout that fired on every turn, and the main request path awaited the full recall promise (including timeout + heuristic fallback). This caused a ~5s delay on every user turn. Changes: - relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s - client.ts: add resolveAutoMemoryWithDeadline() that races the recall promise against a 2.5s deadline, returning empty result if recall hasn't completed in time - client.test.ts: add 2 tests verifying slow recall doesn't block the main request and fast recall still includes memory content Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): remove stripThoughtsFromHistory from GeminiChat mocks stripThoughtsFromHistory was removed from GeminiChat on main. The mock in client.test.ts still referenced it, causing TS2353 build failures in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(core): address PR QwenLM#3814 review comments on auto-memory recall deadline Three changes from the review: 1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is now cleared in a finally block so the dangling timer doesn't block the Node.js event loop from draining during graceful process exit. 2. Telemetry inflation — An AbortController is created before the recall call and aborted when the 2.5s deadline fires. Its signal is passed into the recall pipeline via a new optional abortSignal field on ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls in recall.ts are gated behind !options.abortSignal?.aborted so discarded results don't emit success metrics. 3. Test coverage — Added a test for the !promise guard path (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream completes without calling recall or injecting memory content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(core): address PR QwenLM#3814 review findings for auto-memory recall Critical fixes: - Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so resolve() always runs even if onDeadline() throws, preventing deadlock - Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel → runSideQuery, combined with the existing 2s timeout via AbortSignal.any() so the model API call is cancelled when the 2.5s deadline fires - Abort and clear pendingRecallAbortController on all early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal) - Rename _pendingRecallAbortController → pendingRecallAbortController (inconsistent underscore prefix) Tests added: - client.test.ts: verify sendMessageStream completes normally when recall rejects - relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout, and timeout-only signal when no caller signal provided * fix(core): start auto-memory recall deadline at initiation time Moves the resolveAutoMemoryWithDeadline() race from consumption time to initiation time so the 2.5s budget is not consumed by intermediate work (microcompact, compression, token checks, IDE context) between recall initiation and consumption. The raced promise is stored directly in relevantAutoMemoryPromise and simply awaited at consumption. The pendingRecallAbortController cleanup on early return paths is preserved unchanged. * fix(core): address PR QwenLM#3814 review feedback - resetChat(): abort pendingRecallAbortController so stale in-flight recall does not leak into the next session. - recall.ts catch: distinguish AbortError (deadline-triggered cancellation) from real model errors. AbortError logs at debug level with a message indicating the heuristic result was discarded. Real model errors continue to log at warn with the existing fallback message. * fix(core): address PR QwenLM#3814 round 3 review feedback - Use explicit undefined check instead of non-null assertion for timer in resolveAutoMemoryWithDeadline (clearTimeout) - Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded results after the deadline don't produce output - Downgrade AbortError log level from warn to debug in the client.ts recall catch block, keeping the warn channel meaningful for real failures - Add tests verifying pendingRecallAbortController.abort() is called on MaxSessionTurns and SessionTokenLimitExceeded early-return paths Co-Authored-By: Claude <noreply@anthropic.com> * test(core): assert abort propagation in relevance selector * fix(lint): remove unused eslint-disable directive in relevanceSelector.test.ts The import/no-internal-modules eslint-disable comment was unnecessary — the rule does not flag same-package imports like ../utils/sideQuery.js. ESLint 9 reports it as an unused directive, failing the CI lint check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes #3759. The auto-memory recall side-query had a 5s AbortSignal.timeout that fired on every turn. Since the main request path awaited the full recall promise (including timeout + heuristic fallback), every user turn was delayed by ~5s.
Changes
Core fix
resolveAutoMemoryWithDeadline()that races the recall promise against a 2.5s deadline. If recall hasn't completed by then, proceed with an empty result so the main request is never blocked.AbortSignal.any([AbortSignal.timeout(2_000), callerAbortSignal])so the deadline propagates to the model API call.Abort lifecycle
AbortControlleraspendingRecallAbortControllerinstance field. The 2.5s deadline callsabort()on it, which propagates throughrecall()→selectRelevantAutoMemoryDocumentsByModel→runSideQuery, cancelling the in-flight LLM call.pendingRecallAbortControllerbefore returning, preventing background recall from running after the user-facing turn has already completed.resetChat()aborts and clearspendingRecallAbortControllerso a stale in-flight recall does not leak into the next session.AbortError(logs at debug) from real model errors (logs at warn), keeping the warn channel meaningful for oncall debugging.!abortSignal.abortedso discarded results after the deadline don't produce output.Race-at-initiation trade-off
The recall promise is raced against the deadline at initiation time (when
recall()is called), not at consumption time (when the result is awaited). This gives a consistent 2.5s recall budget regardless of how much intermediate work (microcompact, compression, token checks, IDE context) happens between initiation and consumption. The trade-off is that the recall no longer overlaps with that intermediate work — the total turn latency is slightly higher in the fast-recall case, but the worst-case latency is bounded.Telemetry
logMemoryRecallis gated on!abortSignal.abortedat all 3 call sites, so discarded results don't inflate recall-success metrics.Validation
npm run build— 0 errorsnpx vitest run packages/core/src/core/client.test.ts— 78 tests passnpx vitest run packages/core/src/memory/— 10 tests passRisk
Low. The change only affects the auto-memory recall path. When recall is fast (the common case), behavior is unchanged. When recall is slow, the main request proceeds without memory context instead of waiting. The abort cleanup on early-return paths and
resetChat()prevents resource leaks.