Skip to content

fix(core): prevent auto-memory recall from blocking main request#3814

Merged
wenshao merged 9 commits into
QwenLM:mainfrom
B-A-M-N:fix/auto-memory-recall-blocking-main-request
May 5, 2026
Merged

fix(core): prevent auto-memory recall from blocking main request#3814
wenshao merged 9 commits into
QwenLM:mainfrom
B-A-M-N:fix/auto-memory-recall-blocking-main-request

Conversation

@B-A-M-N

@B-A-M-N B-A-M-N commented May 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #3759. The auto-memory recall side-query had a 5s AbortSignal.timeout that fired on every turn. Since the main request path awaited the full recall promise (including timeout + heuristic fallback), every user turn was delayed by ~5s.

Changes

Core fix

  • client.ts: Add resolveAutoMemoryWithDeadline() that races the recall promise against a 2.5s deadline. If recall hasn't completed by then, proceed with an empty result so the main request is never blocked.
  • relevanceSelector.ts: Reduce model-driven selector timeout from 5s to 2s. The selector is a simple ranking task that should complete quickly; 2s is still generous. When a caller abort signal is provided, combine both via AbortSignal.any([AbortSignal.timeout(2_000), callerAbortSignal]) so the deadline propagates to the model API call.

Abort lifecycle

  • client.ts: Store the AbortController as pendingRecallAbortController instance field. The 2.5s deadline calls abort() on it, which propagates through recall()selectRelevantAutoMemoryDocumentsByModelrunSideQuery, cancelling the in-flight LLM call.
  • client.ts: All 4 early-return paths (MaxSessionTurns, boundedTurns=0, SessionTokenLimitExceeded, Arena control signal) now abort and clear pendingRecallAbortController before returning, preventing background recall from running after the user-facing turn has already completed.
  • client.ts: resetChat() aborts and clears pendingRecallAbortController so a stale in-flight recall does not leak into the next session.
  • recall.ts: When the model-driven selector is aborted by the deadline, the catch block distinguishes AbortError (logs at debug) from real model errors (logs at warn), keeping the warn channel meaningful for oncall debugging.
  • recall.ts: Heuristic fallback is gated on !abortSignal.aborted so discarded results after the deadline don't produce output.

Race-at-initiation trade-off

The recall promise is raced against the deadline at initiation time (when recall() is called), not at consumption time (when the result is awaited). This gives a consistent 2.5s recall budget regardless of how much intermediate work (microcompact, compression, token checks, IDE context) happens between initiation and consumption. The trade-off is that the recall no longer overlaps with that intermediate work — the total turn latency is slightly higher in the fast-recall case, but the worst-case latency is bounded.

Telemetry

  • recall.ts: logMemoryRecall is gated on !abortSignal.aborted at all 3 call sites, so discarded results don't inflate recall-success metrics.

Validation

  • npm run build — 0 errors
  • npx vitest run packages/core/src/core/client.test.ts — 78 tests pass
  • npx vitest run packages/core/src/memory/ — 10 tests pass
  • Pre-commit hooks (prettier + eslint) pass

Risk

Low. The change only affects the auto-memory recall path. When recall is fast (the common case), behavior is unchanged. When recall is slow, the main request proceeds without memory context instead of waiting. The abort cleanup on early-return paths and resetChat() prevents resource leaks.

@B-A-M-N B-A-M-N force-pushed the fix/auto-memory-recall-blocking-main-request branch from bfeb9ce to 727e24f Compare May 3, 2026 18:35
@B-A-M-N

B-A-M-N commented May 3, 2026

Copy link
Copy Markdown
Contributor Author

Branch updated — this PR now only contains the #3759 fix (auto-memory recall blocking). The #3765 fix has been moved to a separate PR #3815.

@B-A-M-N B-A-M-N force-pushed the fix/auto-memory-recall-blocking-main-request branch from 727e24f to 03ce4fa Compare May 3, 2026 22:55
return EMPTY_RELEVANT_AUTO_MEMORY_RESULT;
}

const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] setTimeout in resolveAutoMemoryWithDeadline is never cleared via clearTimeout when the recall promise wins the race. The dangling timer can block the Node.js event loop from draining during graceful process exit — if process.exit() is called within 2.5s of the race resolving, the process will hang until the timer fires.

Suggested change
const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => {
async function resolveAutoMemoryWithDeadline(
promise: Promise<RelevantAutoMemoryPromptResult> | undefined,
): Promise<RelevantAutoMemoryPromptResult> {
if (!promise) {
return EMPTY_RELEVANT_AUTO_MEMORY_RESULT;
}
let timer: ReturnType<typeof setTimeout>;
const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => {
timer = setTimeout(
() => resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT),
2_500,
);
});
try {
return await Promise.race([promise, deadline]);
} finally {
clearTimeout(timer!);
}
}

— deepseek-v4-pro via Qwen Code /review

const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => {
setTimeout(() => resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT), 2_500);
});

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Promise.race does not cancel the loser. When the 2.5s deadline wins, the recall promise continues executing in the background — including logMemoryRecall telemetry in recall.ts that emits a "successful recall" event for a result that was discarded. This inflates production recall-success metrics: oncall engineers see green dashboards while users actually received no memory context.

Consider passing a unified AbortSignal into the recall() pipeline so that when the deadline fires, the entire recall chain (scan → model call → heuristic fallback → telemetry) can abort early. At minimum, gate the logMemoryRecall call behind a signal check so discarded results are not counted.

— deepseek-v4-pro via Qwen Code /review

async function resolveAutoMemoryWithDeadline(
promise: Promise<RelevantAutoMemoryPromptResult> | undefined,
): Promise<RelevantAutoMemoryPromptResult> {
if (!promise) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The early return path if (!promise) return EMPTY_RELEVANT_AUTO_MEMORY_RESULT (triggered when getManagedAutoMemoryEnabled() is false) has no test coverage. The two new tests cover the deadline-hit and fast-completion paths, but the undefined-promise guard is untested.

Consider adding a test where config.getManagedAutoMemoryEnabled() returns false, then verify sendMessageStream completes normally without auto-memory content injected.

— deepseek-v4-pro via Qwen Code /review

B-A-M-N added a commit to B-A-M-N/qwen-code that referenced this pull request May 4, 2026
…ll deadline

Three changes from the review:

1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is
   now cleared in a finally block so the dangling timer doesn't block the
   Node.js event loop from draining during graceful process exit.

2. Telemetry inflation — An AbortController is created before the recall
   call and aborted when the 2.5s deadline fires. Its signal is passed into
   the recall pipeline via a new optional abortSignal field on
   ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls
   in recall.ts are gated behind !options.abortSignal?.aborted so discarded
   results don't emit success metrics.

3. Test coverage — Added a test for the !promise guard path
   (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream
   completes without calling recall or injecting memory content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@B-A-M-N

B-A-M-N commented May 4, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review — all three items have been addressed.

`setTimeout` leak in `resolveAutoMemoryWithDeadline` — The dangling timer is now cleared via `clearTimeout(timer)` in a `finally` block, so when the recall promise wins the race, the process can drain the event loop normally during graceful exit.

Recall continues in background / telemetry inflation — Added an `AbortController` at the call site that gets aborted when the 2.5s deadline fires. The controller's signal is passed into the `recall()` pipeline via a new optional `abortSignal` field on `ResolveRelevantAutoMemoryPromptOptions`. All three `logMemoryRecall` calls in `recall.ts` are now gated behind `!options.abortSignal?.aborted`, so discarded results don't inflate recall-success metrics.

Missing test for `!promise` guard — Added a test (`should proceed without auto-memory when managed auto-memory is disabled`) that mocks `getManagedAutoMemoryEnabled()` returning `false`, then verifies that `recall()` is never called and the main request proceeds without memory content.

Validation:

  • `npm run build` — 0 errors
  • `npx vitest run packages/core/src/core/client.test.ts` — 66 tests passed
  • `npx vitest run packages/core/src/memory/` — 59 tests passed
  • `npx vitest run packages/core/src/core/geminiChat.test.ts` — 46 tests passed

Comment thread packages/core/src/core/client.ts Outdated
let timer: ReturnType<typeof setTimeout>;
const deadline = new Promise<RelevantAutoMemoryPromptResult>((resolve) => {
timer = setTimeout(() => {
onDeadline();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] In resolveAutoMemoryWithDeadline, onDeadline() is called before resolve() in the setTimeout callback. If onDeadline throws (e.g., a rare edge case in AbortController.abort()), resolve() never executes. The deadline promise stays pending forever, Promise.race hangs, and sendMessageStream deadlocks at the await.

Suggested change
onDeadline();
timer = setTimeout(() => {
try {
onDeadline();
} finally {
resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT);
}
}, 2_500);

— deepseek-v4-pro via Qwen Code /review

Comment thread packages/core/src/core/client.ts Outdated
private lastPromptId: string | undefined = undefined;
private lastSentIdeContext: IdeContext | undefined;
private forceFullIdeContext = true;
private _pendingRecallAbortController: AbortController | undefined;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] _pendingRecallAbortController is leaked on 4 early return paths (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal). The controller is set at line 710 but never aborted or cleared before these returns between lines 748-815. On the next sendMessageStream call, the stale controller is overwritten, and the background recall runs to completion with misleading telemetry.

Also note: the _ prefix is inconsistent with all other private fields in GeminiClient (chat, sessionTurnCount, lastPromptId, etc.). Consider pendingRecallAbortController without the underscore.

Suggested change
private _pendingRecallAbortController: AbortController | undefined;
// Wrap the recall-initiation-to-consumption span in try/finally:
// At line 710 (after `this._pendingRecallAbortController = recallAbortController`),
// ensure all early returns and exceptions clean up:
try {
// ... existing logic ...
} finally {
this._pendingRecallAbortController?.abort();
this._pendingRecallAbortController = undefined;
}

— deepseek-v4-pro via Qwen Code /review

@@ -131,6 +131,8 @@ export interface ResolveRelevantAutoMemoryPromptOptions {
excludedFilePaths?: Iterable<string>;
limit?: number;
recentTools?: readonly string[];

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] The abortSignal added here is passed into recall() but never forwarded to selectRelevantAutoMemoryDocumentsByModel. The model selector creates its own independent AbortSignal.timeout(2_000) in relevanceSelector.ts:90, completely decoupled from this external abort signal. When the 2.5s deadline fires, the model API call keeps running — consuming quota for a discarded result. The runSideQuery in relevanceSelector.ts already accepts an abortSignal parameter; the plumbing exists but isn't connected.

Suggested change
recentTools?: readonly string[];
// In recall.ts, pass abortSignal to the model selector:
const selectedDocs = await selectRelevantAutoMemoryDocumentsByModel(
options.config,
query,
docs,
limit,
options.recentTools,
options.abortSignal, // forward the abort signal
);
// In relevanceSelector.ts, combine with existing timeout:
abortSignal: AbortSignal.any([
AbortSignal.timeout(2_000),
callerAbortSignal,
].filter(Boolean))

— deepseek-v4-pro via Qwen Code /review

excludedFilePaths: this.surfacedRelevantAutoMemoryPaths,
abortSignal: recallAbortController.signal,
})
.catch((error: unknown) => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The .catch() error handler that swallows errors and returns EMPTY_RELEVANT_AUTO_MEMORY_RESULT has zero test coverage. No test simulates recall() rejection to verify the main request still completes. If this handler regresses, a failed recall becomes an unhandled promise rejection.

Suggested change
.catch((error: unknown) => {
// Add test in client.test.ts:
it('should proceed normally when recall rejects', async () => {
mockMemoryManager.recall.mockRejectedValue(new Error('recall failed'));
const stream = client.sendMessageStream(
[{ text: 'Quick question' }],
new AbortController().signal,
'prompt-id-recall-fail',
);
for await (const _ of stream) { /* consume */ }
expect(mockTurnRunFn).toHaveBeenCalledWith(
'test-model',
expect.not.arrayContaining([expect.stringContaining('Relevant memory')]),
expect.any(AbortSignal),
);
});

— deepseek-v4-pro via Qwen Code /review

Comment thread packages/core/src/core/client.ts Outdated
const relevantAutoMemory = relevantAutoMemoryPromise
? await relevantAutoMemoryPromise
: EMPTY_RELEVANT_AUTO_MEMORY_RESULT;
const recallAbortController = this._pendingRecallAbortController;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The deadline timer in resolveAutoMemoryWithDeadline starts when the function is called here (consumption time), not when the recall was initiated at line ~692. If the intermediate code (recording, microcompact, compression, token check, IDE context) takes significant time, the effective deadline shrinks, potentially leaving no time for the heuristic fallback. Consider racing the promise with the deadline at initiation time and storing the raced result instead.

Suggested change
const recallAbortController = this._pendingRecallAbortController;
// At initiation (line ~695):
const deadlineResult = resolveAutoMemoryWithDeadline(
recallPromise,
() => recallAbort.abort(),
);
// Store deadlineResult and consume it directly here.

— deepseek-v4-pro via Qwen Code /review

B-A-M-N added a commit to B-A-M-N/qwen-code that referenced this pull request May 4, 2026
Critical fixes:
- Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so
  resolve() always runs even if onDeadline() throws, preventing deadlock
- Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel
  → runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  so the model API call is cancelled when the 2.5s deadline fires
- Abort and clear pendingRecallAbortController on all early return paths
  (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal)
- Rename _pendingRecallAbortController → pendingRecallAbortController
  (inconsistent underscore prefix)

Tests added:
- client.test.ts: verify sendMessageStream completes normally when recall rejects
- relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout,
  and timeout-only signal when no caller signal provided
@B-A-M-N

B-A-M-N commented May 4, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review — all concerns have been addressed.

Deadline timer moved from consumption to initiation time:
The resolveAutoMemoryWithDeadline call now wraps the recall promise at initiation time rather than at consumption time. Previously, the 2.5s deadline timer started after intermediate processing (recording, microcompact, compression, token limit check, IDE context, arena check), which could shrink the effective time available to the recall heuristic fallback. Starting the race at initiation guarantees the recall gets the full 2.5s regardless of processing time.

Previously addressed (from earlier commits):

  • onDeadline() wrapped in try/finally so resolve() always fires, preventing deadlock
  • pendingRecallAbortController cleaned up on all 4 early-return paths with ?.abort() + = undefined
  • abortSignal forwarded from recall() through selectRelevantAutoMemoryDocumentsByModel to runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  • logMemoryRecall gated behind !options.abortSignal?.aborted to prevent telemetry inflation from discarded results
  • Test added for the !promise guard path (getManagedAutoMemoryEnabled() returning false)
  • Test added for recall rejection (.catch() error handler)

Validation:

  • npx vitest run packages/core/src/core/client.test.ts — 67 tests passed
  • npx vitest run packages/core/src/memory/ — 61 tests passed
  • npm run build — 0 errors

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (not tied to specific diff lines)

[Suggestion] resetChat() (line ~269) clears surfacedRelevantAutoMemoryPaths but does not abort/clear pendingRecallAbortController. If resetChat() is called during an in-flight recall, the stale controller persists into the next session. Consider adding this.pendingRecallAbortController?.abort(); this.pendingRecallAbortController = undefined; in resetChat().

[Suggestion] The catch block log in recall.ts (~line 221) says "Model-driven auto-memory recall failed; falling back to heuristic selection" — but when the deadline triggers an abort, the model didn't fail, it was cancelled. The heuristic result is also discarded (deadline already resolved). Consider distinguishing AbortError from real model errors to avoid misleading oncall debugging.

— deepseek-v4-pro via Qwen Code /review

onDeadline();
} finally {
resolve(EMPTY_RELEVANT_AUTO_MEMORY_RESULT);
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] When the 2.5s deadline fires, the system silently proceeds without memory context — no warning log, no telemetry event, no observable signal. Oncall engineers cannot detect recall degradation from logs or monitoring.

Consider adding a debugLogger.warn and a recall_timed_out telemetry event before the deadline resolves with EMPTY_RELEVANT_AUTO_MEMORY_RESULT.

— deepseek-v4-pro via Qwen Code /review

private lastPromptId: string | undefined = undefined;
private lastSentIdeContext: IdeContext | undefined;
private forceFullIdeContext = true;
private pendingRecallAbortController: AbortController | undefined;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] pendingRecallAbortController is stored as an instance field. If sendMessageStream is ever called concurrently (no mutex guard currently), a second call would overwrite the field, causing the first call's early-return cleanup to abort the wrong controller. The onDeadline closure captures the correct controller (asymmetric correctness).

Consider using a local variable within sendMessageStream instead of an instance field, so each invocation has its own controller.

— deepseek-v4-pro via Qwen Code /review

this.config.getMaxSessionTurns() > 0 &&
this.sessionTurnCount > this.config.getMaxSessionTurns()
) {
this.pendingRecallAbortController?.abort();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The 4 early-return paths (MaxSessionTurns, boundedTurns === 0, SessionTokenLimitExceeded, Arena control signal) each clean up pendingRecallAbortController with ?.abort() + = undefined, but none of these paths have test coverage verifying the abort actually happens. If the cleanup is accidentally removed or reordered, it would not be caught.

Consider adding tests for each early-return path that assert the abort signal passed to mockMemoryManager.recall is in the aborted state after the return.

— deepseek-v4-pro via Qwen Code /review

excludedFilePaths?: Iterable<string>;
limit?: number;
recentTools?: readonly string[];
/** When provided and aborted, suppresses logMemoryRecall telemetry for discarded results. */

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The JSDoc for abortSignal says "suppresses logMemoryRecall telemetry for discarded results" — but the signal's primary role is to propagate the caller's deadline abort to the model-driven selector so the LLM call is actually cancelled. The telemetry suppression is a side effect, not the main purpose.

Update the JSDoc to document both roles: abort propagation (primary) and telemetry suppression (secondary).

— deepseek-v4-pro via Qwen Code /review

@@ -131,6 +131,8 @@ export interface ResolveRelevantAutoMemoryPromptOptions {
excludedFilePaths?: Iterable<string>;
limit?: number;
recentTools?: readonly string[];

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] resolveRelevantAutoMemoryPromptForQuery has zero unit tests — packages/core/src/memory/ contains no test file for recall.ts. The new abortSignal parameter's conditional suppression of logMemoryRecall at 3 sites (lines 173, 204, 232) is only exercised indirectly through client.test.ts where mockMemoryManager.recall bypasses the real function.

Consider adding packages/core/src/memory/recall.test.ts with tests that pass an already-aborted AbortSignal and assert logMemoryRecall is not called.

— deepseek-v4-pro via Qwen Code /review

expect(runSideQuery).not.toHaveBeenCalled();
});

it('forwards caller abort signal to runSideQuery combined with timeout', async () => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The two new abort signal tests only assert passedSignal is instanceof AbortSignal, but never verify that when the caller signal is aborted, the combined AbortSignal.any([...]) signal is also in the aborted state. The core signal propagation logic is untested.

Add a test: create an AbortController, call .abort(), pass its signal as callerAbortSignal, then assert passedSignal.aborted === true.

— deepseek-v4-pro via Qwen Code /review

);
});

it('should not block the main request when auto-memory recall is slow', async () => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The slow-recall test verifies that memory content is excluded from the main request (the result), but does not verify that recallAbortController.abort() was actually called (the mechanism). If the onDeadline callback is refactored away, the test still passes while production silently wastes resources.

Consider capturing the AbortSignal passed to mockMemoryManager.recall, advancing timers past the deadline, and asserting capturedSignal.aborted === true.

— deepseek-v4-pro via Qwen Code /review

contents,
schema: RESPONSE_SCHEMA,
abortSignal: AbortSignal.timeout(5_000),
abortSignal: callerAbortSignal

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The model-driven selector timeout was reduced from 5s to 2s, and the 2.5s client deadline is hardcoded in a separate file with no shared constant or invariant check. If someone adjusts the model timeout above 2.5s, the deadline always wins, permanently disabling model-driven selection with no warning.

Consider: (1) keeping the standalone timeout at 5s when no callerAbortSignal is present (only use 2s inside AbortSignal.any); (2) extracting both timeout values to a shared constant or adding a runtime invariant(modelTimeoutMs < deadlineMs) check.

— deepseek-v4-pro via Qwen Code /review

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 3 review

The abort plumbing looks solid now — signal threads cleanly from the deadline through recall()selectRelevantAutoMemoryDocumentsByModelrunSideQuery, the four early-return paths all clear pendingRecallAbortController, and logMemoryRecall is gated on !aborted so discarded results don't pollute success metrics. The try/finally around onDeadline() and the move of the race to initiation time both look right. Net-net, this is in much better shape than round 1.

One thing I'd like to see before approving:

Missing test for early-return abort cleanup

The four pendingRecallAbortController?.abort() cleanups added in 1edd7fdd (MaxSessionTurns / boundedTurns=0 / SessionTokenLimitExceeded / arena cancel) are net-new behavior with zero test coverage. If a future refactor accidentally drops one of these, recall will keep running in the background after the user-facing turn already returned — exactly the silent-degradation class of bug this PR set out to fix. Please add at least one direct test.

Suggested starting point, modeled on the existing should yield MaxSessionTurns and stop when session turn limit is reached test:

it('should abort the pending recall when MaxSessionTurns is hit', async () => {
  vi.spyOn(client['config'], 'getMaxSessionTurns').mockReturnValue(1);
  client['sessionTurnCount'] = 1; // already at limit; next call exceeds it

  const recallAbortHandler = vi.fn();
  mockMemoryManager.recall.mockImplementation((_root, _query, opts) => {
    opts.abortSignal?.addEventListener('abort', recallAbortHandler);
    return new Promise(() => {}); // never resolves
  });

  const mockStream = (async function* () {
    yield { type: 'content', value: 'Hello' };
  })();
  mockTurnRunFn.mockReturnValue(mockStream);

  const mockChat: Partial<GeminiChat> = {
    addHistory: vi.fn(),
    getHistory: vi.fn().mockReturnValue([]),
  };
  client['chat'] = mockChat as GeminiChat;

  const stream = client.sendMessageStream(
    [{ text: 'over the limit' }],
    new AbortController().signal,
    'prompt-id-over-limit',
  );
  const events = [];
  for await (const event of stream) {
    events.push(event);
  }

  expect(events).toEqual([{ type: GeminiEventType.MaxSessionTurns }]);
  expect(recallAbortHandler).toHaveBeenCalledTimes(1);
});

Same shape works for SessionTokenLimitExceeded and the arena-control path; boundedTurns=0 is low priority (only triggers when turns=0 is passed in explicitly), so one or two of these is enough.

Non-blocking suggestions (your call, fine as follow-ups)

  1. clearTimeout(timer!) non-null assertion is a bit fragile — it relies on the implicit "Promise constructor runs executor synchronously" contract. Safer:

    let timer: ReturnType<typeof setTimeout> | undefined;
    ...
    finally { if (timer !== undefined) clearTimeout(timer); }
  2. recall.ts heuristic fallback ignores abort — when the model selector aborts, the catch falls through to the synchronous heuristic, which doesn't check abortSignal.aborted. The work itself is fast and in-memory so this is harmless, but strictly it shouldn't produce results after the deadline. One-line guard at the top of the catch block.

  3. Log level for expected abortsdebugLogger.warn('Managed auto-memory recall prefetch failed.', error) will fire for every deadline-triggered abort, which is the designed path on a slow setup. Consider downgrading AbortError to debug so the warn channel stays meaningful:

    .catch((error) => {
      if ((error as { name?: string })?.name === 'AbortError') {
        debugLogger.debug('Auto-memory recall aborted by deadline.');
      } else {
        debugLogger.warn('Managed auto-memory recall prefetch failed.', error);
      }
      return EMPTY_RELEVANT_AUTO_MEMORY_RESULT;
    });
  4. PR description — the move from race-at-consumption to race-at-initiation in 857a306c is a deliberate trade-off (consistent recall budget vs. lost overlap with intermediate work). Worth a sentence in the PR description so future readers don't have to dig through commits.

  5. Follow-up idea, not for this PR — issue #3759 also suggested routing the recall selector to fastModel. Orthogonal to this PR; with a fast model behind it the deadline could probably drop to ~1s. Worth a follow-up issue if you don't already have one.

Once the early-return abort test (item above) lands I'll approve.

B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.
@B-A-M-N

B-A-M-N commented May 5, 2026

Copy link
Copy Markdown
Contributor Author

@wenshao (via deepseek-v4-pro) — Both suggestions addressed:

1. resetChat() not aborting pendingRecallAbortController — Fixed. Now aborts and clears the controller before calling startChat(), so a stale in-flight recall from the previous session does not leak into the next one.

2. Misleading catch-block log in recall.ts — Fixed. Now distinguishes AbortError (deadline-triggered cancellation) from real model errors:

  • AbortErrordebugLogger.debug()" Model-driven auto-memory recall cancelled by deadline; heuristic result discarded."
  • Real errors → debugLogger.warn() with the existing "falling back to heuristic selection" message

This way an oncall engineer reading logs can tell whether the recall was cancelled by the deadline (normal) vs an actual model failure (investigate).

Validation: 128/128 tests pass across 16 test files (client.test.ts + memory/). Build clean.

Commit: 167970efe

@B-A-M-N

B-A-M-N commented May 5, 2026

Copy link
Copy Markdown
Contributor Author

Both suggestions from the review have been addressed and pushed (commit 167970e). 128/128 tests pass.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick nudge — 167970efe looks good but I think it crossed wires with my round-3 review.

Acknowledged from this commit:

  • resetChat() now aborts + clears pendingRecallAbortController
  • recall.ts catch distinguishes AbortError (debug) from real model errors (warn) ✓ — this is exactly my round-3 item #3, and the error instanceof DOMException && error.name === 'AbortError' check is actually narrower / better than the .name-only check I suggested

Still outstanding from my round-3 review (the CHANGES_REQUESTED one I posted at 02:04 UTC, separate from the deepseek-v4-pro one you replied to):

🔴 #1 — Missing test for the four pendingRecallAbortController?.abort() early-return paths added in 1edd7fdd (MaxSessionTurns / boundedTurns=0 / SessionTokenLimitExceeded / arena cancel). I posted a copy-pasteable test sketch in that review — modeled on the existing should yield MaxSessionTurns and stop when session turn limit is reached test, uses mockMemoryManager.recall.mockImplementation to register an abort listener, asserts recallAbortHandler fires once. This is the only blocker — my round-3 review explicitly said "Once the early-return abort test lands I'll approve."

The other items in that review (#2 clearTimeout(timer!) non-null, #4 PR description, #5 fastModel follow-up) are all optional / nice-to-have and don't gate merge.

Also FYI: the branch is now mergeable: CONFLICTING — main moved while you were iterating. Will need a rebase before merge.

CI also hasn't run on 167970efe yet (likely held by the conflict state).

B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>
B-A-M-N added a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
…ll deadline

Three changes from the review:

1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is
   now cleared in a finally block so the dangling timer doesn't block the
   Node.js event loop from draining during graceful process exit.

2. Telemetry inflation — An AbortController is created before the recall
   call and aborted when the 2.5s deadline fires. Its signal is passed into
   the recall pipeline via a new optional abortSignal field on
   ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls
   in recall.ts are gated behind !options.abortSignal?.aborted so discarded
   results don't emit success metrics.

3. Test coverage — Added a test for the !promise guard path
   (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream
   completes without calling recall or injecting memory content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
B-A-M-N added a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
Critical fixes:
- Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so
  resolve() always runs even if onDeadline() throws, preventing deadlock
- Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel
  → runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  so the model API call is cancelled when the 2.5s deadline fires
- Abort and clear pendingRecallAbortController on all early return paths
  (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal)
- Rename _pendingRecallAbortController → pendingRecallAbortController
  (inconsistent underscore prefix)

Tests added:
- client.test.ts: verify sendMessageStream completes normally when recall rejects
- relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout,
  and timeout-only signal when no caller signal provided
B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.
@B-A-M-N B-A-M-N force-pushed the fix/auto-memory-recall-blocking-main-request branch from 8ed1193 to 60238c7 Compare May 5, 2026 06:52
B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4 review

The 4 early-return abort tests + arena/heuristic abort plumbing all look right now on 60238c7b1; CI is green across all 11 platforms. No blockers — items below are non-blocking suggestions / latent risks worth filing as follow-ups if not addressed inline.

Non-blocking suggestions

1. relevanceSelector.test.ts abort-forwarding test asserts type, not behavior.

The current assertion expect(passedSignal).toBeInstanceOf(AbortSignal) would still pass if the code regressed to abortSignal: AbortSignal.timeout(2_000) without the AbortSignal.any(...) combine — both branches return an AbortSignal. The contract this test is supposed to lock in ("aborting the caller signal aborts the combined signal forwarded to runSideQuery") has no regression protection. Suggested:

const callerController = new AbortController();
let capturedSignal: AbortSignal | undefined;
vi.mocked(runSideQuery).mockImplementation(async (_config, opts) => {
  capturedSignal = opts.abortSignal;
  return { selected_memories: [] };
});

await selectRelevantAutoMemoryDocumentsByModel(
  mockConfig, 'check preferences', docs, 2, [], callerController.signal,
);

callerController.abort();
expect(capturedSignal!.aborted).toBe(true);

2. arena-cancel early-return abort path has no direct test.

Same shape as the new MaxSessionTurns / SessionTokenLimitExceeded tests, just with an arenaAgentClient mock returning a control signal. Three of the four early-return paths added in 1edd7fdd are now covered; the arena one is the odd one out. Skip is fine if you'd rather follow-up, but the round-3 ask was about all four.

3. client.ts:918-920 comment is slightly misleading.

// The recall promise was already raced against the 2.5s deadline at
// initiation time; this await just collects the result.
this.pendingRecallAbortController = undefined;

The comment explains why we can await directly, but not why we can clear the controller before the await — a future reader will worry the in-flight recall loses its kill-switch. Suggested addendum: "The deadline timer keeps running independently inside `resolveAutoMemoryWithDeadline`; clearing the field here is safe because no further early-return path can fire after this point."

4. Concurrent `sendMessageStream` calls would race on `pendingRecallAbortController` (latent).

Because the field is single-valued, if a UserQuery is mid-flight and a Notification call enters and hits MaxSessionTurns, the Notification will `?.abort()` the UserQuery's controller — Notification doesn't initialize a new controller (it's gated on `UserQuery || Cron`), so it sees the stale field. Mitigated in practice by the deadline timer + the CLI processing one stream at a time, but the code doesn't enforce this. If you want to harden, switch to `Set` (add on init, delete on consumption / abort). Fine as a follow-up.

5. Magic numbers `2_500` and `2_000`.

These two constants are semantically linked (deadline = model timeout + 0.5s heuristic-fallback budget) but live in different files. A pair of named constants with JSDoc explaining the relationship would help readers — and makes the eventual fast-model retune (#3759 follow-up) a single-line change.

Round-3 items confirmed addressed in this commit (60238c7b1)

  • All 4 early-return paths abort + null the controller (3 of 4 now have direct tests)
  • `resetChat()` aborts in-flight recall
  • `clearTimeout(timer)` uses explicit `undefined` check, no non-null assertion
  • `recall.ts` catch distinguishes `AbortError` (debug) from real failures (warn)
  • `client.ts` recall `.catch` does the same
  • `logMemoryRecall` gated on `!options.abortSignal?.aborted` in all three call sites
  • Heuristic fallback skipped when caller abort already set
  • Deadline race moved from consumption-time to initiation-time

LGTM modulo the suggestions above — happy to approve once #1 (real abort-propagation assertion) lands. Others can be follow-ups.

B-A-M-N and others added 4 commits May 5, 2026 13:34
QwenLM#3759)

The auto-memory recall side-query had a 5s AbortSignal.timeout that fired
on every turn, and the main request path awaited the full recall promise
(including timeout + heuristic fallback). This caused a ~5s delay on every
user turn.

Changes:
- relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s
- client.ts: add resolveAutoMemoryWithDeadline() that races the recall
  promise against a 2.5s deadline, returning empty result if recall
  hasn't completed in time
- client.test.ts: add 2 tests verifying slow recall doesn't block the
  main request and fast recall still includes memory content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stripThoughtsFromHistory was removed from GeminiChat on main.
The mock in client.test.ts still referenced it, causing TS2353 build failures in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ll deadline

Three changes from the review:

1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is
   now cleared in a finally block so the dangling timer doesn't block the
   Node.js event loop from draining during graceful process exit.

2. Telemetry inflation — An AbortController is created before the recall
   call and aborted when the 2.5s deadline fires. Its signal is passed into
   the recall pipeline via a new optional abortSignal field on
   ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls
   in recall.ts are gated behind !options.abortSignal?.aborted so discarded
   results don't emit success metrics.

3. Test coverage — Added a test for the !promise guard path
   (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream
   completes without calling recall or injecting memory content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Critical fixes:
- Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so
  resolve() always runs even if onDeadline() throws, preventing deadlock
- Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel
  → runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  so the model API call is cancelled when the 2.5s deadline fires
- Abort and clear pendingRecallAbortController on all early return paths
  (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal)
- Rename _pendingRecallAbortController → pendingRecallAbortController
  (inconsistent underscore prefix)

Tests added:
- client.test.ts: verify sendMessageStream completes normally when recall rejects
- relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout,
  and timeout-only signal when no caller signal provided
@wenshao

wenshao commented May 5, 2026

Copy link
Copy Markdown
Collaborator

Thanks — 60238c7b1 lands all the round-3 items cleanly:

  • Early-return abort cleanup tests — both 'should abort the pending recall when MaxSessionTurns is hit' and 'should abort the pending recall when SessionTokenLimitExceeded' are in. (Bonus: you covered the SessionTokenLimit path too — I'd only sketched MaxSessionTurns.)
  • clearTimeout(timer) switched to let timer: ... | undefined; + explicit undefined check — no more non-null assertion.
  • recall.ts heuristic fallback now gated on abortSignal.aborted so discarded results after the deadline don't produce output.
  • ✅ AbortError log downgraded to debug in the recall .catch, warn channel stays meaningful for real failures.
  • ✅ CI 13/13 green.

Only thing left: branch is CONFLICTING against main again — looks like you rebased once already (I can see 2c93fd670 and friends in the ancestry), but main moved further while you were iterating. Please pull/rebase one more time and I'll approve as soon as CI comes back green on the new base. Round-3 #4 (PR description rationale for race-at-initiation) and #5 (fastModel routing follow-up issue) are still optional non-blockers — your call whether to fold them in or leave for follow-up.

B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.
B-A-M-N pushed a commit to B-A-M-N/qwen-code that referenced this pull request May 5, 2026
- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>
@B-A-M-N B-A-M-N force-pushed the fix/auto-memory-recall-blocking-main-request branch from 60238c7 to d1fe01d Compare May 5, 2026 18:52
@B-A-M-N

B-A-M-N commented May 5, 2026

Copy link
Copy Markdown
Contributor Author

Commit: d1fe01d8552b7ebcd2880d93a33920927543728c (d1fe01d85)

Thanks for the thorough round 4 review — the abort-propagation test fix is exactly right.

Abort-propagation test now asserts behavior, not type:

The test in relevanceSelector.test.ts previously asserted expect(passedSignal).toBeInstanceOf(AbortSignal), which would pass even if the code regressed to a timeout-only signal. The updated test now:

  1. Captures the actual AbortSignal passed to runSideQuery via mockImplementation
  2. Verifies it's defined and not initially aborted
  3. Aborts the caller controller
  4. Uses vi.waitFor to assert the captured (combined) signal becomes aborted

This directly tests the AbortSignal.any() contract — aborting the caller signal must propagate to the combined signal forwarded to runSideQuery.

ESLint fix: Removed the Config type import (was flagged by import/no-internal-modules) and derived the config type via Parameters<typeof selectRelevantAutoMemoryDocumentsByModel>[0]. Added an eslint-disable comment for the sideQuery import with a justification — the test must mock the exact module path that relevanceSelector.ts imports.

Rebase: Branch rebased onto latest origin/main (commit c7facf501). One conflict in client.ts resolved — kept both the perModelGeneratorCache.clear() from main and the pendingRecallAbortController cleanup from this PR.

Root cause of pre-existing test failures: Stale compiled .js files in src/ (compiled at 12:08 before TS sources were updated) were being resolved by vitest instead of the .ts sources. Removing these caused all 86 client tests and 117 memory tests to pass.

Validation:

  • npx vitest run packages/core/src/memory/relevanceSelector.test.ts — 5 passed
  • npx vitest run packages/core/src/core/client.test.ts — 86 passed
  • npx vitest run packages/core/src/memory/ — 117 passed (29 files)

B-A-M-N and others added 4 commits May 5, 2026 16:28
Moves the resolveAutoMemoryWithDeadline() race from consumption time to
initiation time so the 2.5s budget is not consumed by intermediate work
(microcompact, compression, token checks, IDE context) between recall
initiation and consumption. The raced promise is stored directly in
relevantAutoMemoryPromise and simply awaited at consumption.

The pendingRecallAbortController cleanup on early return paths is
preserved unchanged.
- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.
- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>
@B-A-M-N B-A-M-N force-pushed the fix/auto-memory-recall-blocking-main-request branch from d1fe01d to 14ae8d6 Compare May 5, 2026 21:31
…r.test.ts

The import/no-internal-modules eslint-disable comment was unnecessary —
the rule does not flag same-package imports like ../utils/sideQuery.js.
ESLint 9 reports it as an unused directive, failing the CI lint check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wenshao

wenshao commented May 5, 2026

Copy link
Copy Markdown
Collaborator

Local E2E verification

Reproduced the bug and confirmed the fix end-to-end against a real deepseek-v4-pro setup.

Setup

  • Built c7facf501 (current main, no fix) and the PR head c53ca481d into isolated worktrees, with @qwen-code/* workspace symlinks pointing at each worktree's own packages/. Without that, both runs resolve to the same shared node_modules and end up exercising the same code — easy to miss.
  • Memory dir at ~/.qwen/projects/-tmp-qwen-repro-3759/memory/ populated with 5 sample entries.
  • Working dir /tmp/qwen-repro-3759, model deepseek-v4-pro, prompt "Reply with just OK", 3 runs per build.

Wall-clock

Build Run 1 Run 2 Run 3 Median
main c7facf501 (no fix) 13.57s 9.32s 11.19s 11.19s
PR c53ca481d (with fix) 10.82s 7.45s 7.61s 7.61s

~2.6s per turn, ~23% faster — matches the expected gap from cutting recall blocking from ~5s to the 2.5s deadline.

Debug-log timing (from ~/.qwen/debug/<session>.txt)

Build init → recall abort warn Notes
PR run 1 2.118s 2s selector timeout fires deterministically
PR run 2 2.119s same
PR run 3 2.119s same
main run 1 no warn selector returned within 5s; main path still synchronously waited ~3–5s
main run 2 no warn same
main run 3 no warn same

Useful confirmation: the bug isn't strictly "selector hits the 5s timeout every turn." On deepseek-v4-pro the selector often completes in 3–5s, but the main path waits for it regardless. That matches the "delayed by ~5s − prepDuration" framing in #3759, and the PR's choice to race a deadline at initiation (rather than just shrinking the selector timeout) is the right shape of fix.

One observation worth a follow-up (non-blocking)

recall.ts:224 distinguishes AbortError from real model errors with error instanceof DOMException && error.name === 'AbortError'. In the real abort path this branch never fires — the OpenAI client wraps the abort as a plain Error with message "Request was aborted.", so every abort falls through to the debugLogger.warn(...) path on line 230. Functional fix is unaffected (recall still aborts, main path is still unblocked), but the log-differentiation introduced for oncall debugging doesn't actually differentiate in practice. Worth a small follow-up — match on the wrapped message, or detect at a layer where the abort hasn't been re-wrapped.

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Reviewed across 4 rounds, all blocking items addressed; local E2E reproduction confirms the ~2.5s/turn improvement (see comment above).

@wenshao wenshao merged commit 2a5be0d into QwenLM:main May 5, 2026
13 checks passed
TaimoorSiddiquiOfficial pushed a commit to TaimoorSiddiquiOfficial/HopCode that referenced this pull request May 6, 2026
…nLM#3836)

- fix(core): prevent auto-memory recall from blocking main request (QwenLM#3814)
- feat(core,cli): surface and cancel auto-memory dream tasks (QwenLM#3836)
- fix(sdk-python): standardize TAG_PREFIX to include v suffix (QwenLM#3832)

Conflict resolutions:
- Footer.tsx: removed stale dreamRunning/configInitMessage (not used in render)
- BackgroundTaskViewContext.tsx: kept @hoptrendy/hopcode-core, added createDebugLogger
- manager.test.ts: included upstream cancelTask + subscribe filter tests,
  replaced QWEN_CODE_MEMORY_LOCAL with HOPCODE_MEMORY_LOCAL
- get-release-version.js: kept hopcode-sdk brand, adopted upstream v prefix fix

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DragonnZhang pushed a commit that referenced this pull request May 8, 2026
* fix(core): prevent auto-memory recall from blocking main request (issue #3759)

The auto-memory recall side-query had a 5s AbortSignal.timeout that fired
on every turn, and the main request path awaited the full recall promise
(including timeout + heuristic fallback). This caused a ~5s delay on every
user turn.

Changes:
- relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s
- client.ts: add resolveAutoMemoryWithDeadline() that races the recall
  promise against a 2.5s deadline, returning empty result if recall
  hasn't completed in time
- client.test.ts: add 2 tests verifying slow recall doesn't block the
  main request and fast recall still includes memory content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): remove stripThoughtsFromHistory from GeminiChat mocks

stripThoughtsFromHistory was removed from GeminiChat on main.
The mock in client.test.ts still referenced it, causing TS2353 build failures in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(core): address PR #3814 review comments on auto-memory recall deadline

Three changes from the review:

1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is
   now cleared in a finally block so the dangling timer doesn't block the
   Node.js event loop from draining during graceful process exit.

2. Telemetry inflation — An AbortController is created before the recall
   call and aborted when the 2.5s deadline fires. Its signal is passed into
   the recall pipeline via a new optional abortSignal field on
   ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls
   in recall.ts are gated behind !options.abortSignal?.aborted so discarded
   results don't emit success metrics.

3. Test coverage — Added a test for the !promise guard path
   (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream
   completes without calling recall or injecting memory content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(core): address PR #3814 review findings for auto-memory recall

Critical fixes:
- Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so
  resolve() always runs even if onDeadline() throws, preventing deadlock
- Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel
  → runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  so the model API call is cancelled when the 2.5s deadline fires
- Abort and clear pendingRecallAbortController on all early return paths
  (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal)
- Rename _pendingRecallAbortController → pendingRecallAbortController
  (inconsistent underscore prefix)

Tests added:
- client.test.ts: verify sendMessageStream completes normally when recall rejects
- relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout,
  and timeout-only signal when no caller signal provided

* fix(core): start auto-memory recall deadline at initiation time

Moves the resolveAutoMemoryWithDeadline() race from consumption time to
initiation time so the 2.5s budget is not consumed by intermediate work
(microcompact, compression, token checks, IDE context) between recall
initiation and consumption. The raced promise is stored directly in
relevantAutoMemoryPromise and simply awaited at consumption.

The pendingRecallAbortController cleanup on early return paths is
preserved unchanged.

* fix(core): address PR #3814 review feedback

- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.

* fix(core): address PR #3814 round 3 review feedback

- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>

* test(core): assert abort propagation in relevance selector

* fix(lint): remove unused eslint-disable directive in relevanceSelector.test.ts

The import/no-internal-modules eslint-disable comment was unnecessary —
the rule does not flag same-package imports like ../utils/sideQuery.js.
ESLint 9 reports it as an unused directive, failing the CI lint check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
…nLM#3814)

* fix(core): prevent auto-memory recall from blocking main request (issue QwenLM#3759)

The auto-memory recall side-query had a 5s AbortSignal.timeout that fired
on every turn, and the main request path awaited the full recall promise
(including timeout + heuristic fallback). This caused a ~5s delay on every
user turn.

Changes:
- relevanceSelector.ts: reduce model-driven selector timeout from 5s to 2s
- client.ts: add resolveAutoMemoryWithDeadline() that races the recall
  promise against a 2.5s deadline, returning empty result if recall
  hasn't completed in time
- client.test.ts: add 2 tests verifying slow recall doesn't block the
  main request and fast recall still includes memory content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): remove stripThoughtsFromHistory from GeminiChat mocks

stripThoughtsFromHistory was removed from GeminiChat on main.
The mock in client.test.ts still referenced it, causing TS2353 build failures in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(core): address PR QwenLM#3814 review comments on auto-memory recall deadline

Three changes from the review:

1. clearTimeout leak — The setTimeout in resolveAutoMemoryWithDeadline is
   now cleared in a finally block so the dangling timer doesn't block the
   Node.js event loop from draining during graceful process exit.

2. Telemetry inflation — An AbortController is created before the recall
   call and aborted when the 2.5s deadline fires. Its signal is passed into
   the recall pipeline via a new optional abortSignal field on
   ResolveRelevantAutoMemoryPromptOptions. All three logMemoryRecall calls
   in recall.ts are gated behind !options.abortSignal?.aborted so discarded
   results don't emit success metrics.

3. Test coverage — Added a test for the !promise guard path
   (getManagedAutoMemoryEnabled() returns false) verifying sendMessageStream
   completes without calling recall or injecting memory content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(core): address PR QwenLM#3814 review findings for auto-memory recall

Critical fixes:
- Wrap onDeadline() in try/finally inside resolveAutoMemoryWithDeadline so
  resolve() always runs even if onDeadline() throws, preventing deadlock
- Forward caller abortSignal through recall() → selectRelevantAutoMemoryDocumentsByModel
  → runSideQuery, combined with the existing 2s timeout via AbortSignal.any()
  so the model API call is cancelled when the 2.5s deadline fires
- Abort and clear pendingRecallAbortController on all early return paths
  (MaxSessionTurns, BoundedTurns=0, SessionTokenLimitExceeded, Arena control signal)
- Rename _pendingRecallAbortController → pendingRecallAbortController
  (inconsistent underscore prefix)

Tests added:
- client.test.ts: verify sendMessageStream completes normally when recall rejects
- relevanceSelector.test.ts: verify caller abort signal is forwarded combined with timeout,
  and timeout-only signal when no caller signal provided

* fix(core): start auto-memory recall deadline at initiation time

Moves the resolveAutoMemoryWithDeadline() race from consumption time to
initiation time so the 2.5s budget is not consumed by intermediate work
(microcompact, compression, token checks, IDE context) between recall
initiation and consumption. The raced promise is stored directly in
relevantAutoMemoryPromise and simply awaited at consumption.

The pendingRecallAbortController cleanup on early return paths is
preserved unchanged.

* fix(core): address PR QwenLM#3814 review feedback

- resetChat(): abort pendingRecallAbortController so stale in-flight recall
  does not leak into the next session.
- recall.ts catch: distinguish AbortError (deadline-triggered cancellation)
  from real model errors. AbortError logs at debug level with a message
  indicating the heuristic result was discarded. Real model errors continue
  to log at warn with the existing fallback message.

* fix(core): address PR QwenLM#3814 round 3 review feedback

- Use explicit undefined check instead of non-null assertion for timer in
  resolveAutoMemoryWithDeadline (clearTimeout)
- Gate heuristic fallback in recall.ts on abortSignal.aborted so discarded
  results after the deadline don't produce output
- Downgrade AbortError log level from warn to debug in the client.ts recall
  catch block, keeping the warn channel meaningful for real failures
- Add tests verifying pendingRecallAbortController.abort() is called on
  MaxSessionTurns and SessionTokenLimitExceeded early-return paths

Co-Authored-By: Claude <noreply@anthropic.com>

* test(core): assert abort propagation in relevance selector

* fix(lint): remove unused eslint-disable directive in relevanceSelector.test.ts

The import/no-internal-modules eslint-disable comment was unnecessary —
the rule does not flag same-package imports like ../utils/sideQuery.js.
ESLint 9 reports it as an unused directive, failing the CI lint check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto-memory recall blocks every user turn for 5s before timing out

2 participants