Auto-memory recall blocks every user turn for 5s before timing out

### What happened?

**Every user turn is delayed by close to 5 seconds before the main model sees the prompt**, because the auto-memory recall selector is awaited on the request path and consistently times out.

The selector is fired off early as a non-blocking promise so that other prep work (history compaction, IDE context gathering, etc.) can run concurrently. Just before the main request is sent, the code awaits the recall promise. When the recall does not finish within the prep window — which is the situation in this issue — the main query is delayed by roughly `5s − prepDuration`. On a typical turn where prep takes a few hundred milliseconds, that is close to a full 5-second per-turn slowdown. The user perceives every prompt as starting late.
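The delay arithmetic above can be written as a small model. This is an illustrative sketch, not code from the qwen-code source: `addedDelayMs` and its parameters are hypothetical names, and it assumes the recall promise settles either when the selector finishes or when its deadline fires, whichever comes first.

```typescript
// Hypothetical model of the per-turn delay; names are illustrative only.
// deadlineMs: the side query's own timeout (5000 in this issue)
// prepMs:     how long the concurrent prep work takes
// recallMs:   how long the selector would need if it were never aborted
function addedDelayMs(deadlineMs: number, prepMs: number, recallMs: number): number {
  // The recall promise settles when the selector finishes or the deadline fires.
  const settledAtMs = Math.min(recallMs, deadlineMs);
  // Only the part of that wait not hidden behind prep work delays the main request.
  return Math.max(0, settledAtMs - prepMs);
}

// Selector never finishes, prep takes 300 ms: the turn starts 4700 ms late.
console.log(addedDelayMs(5000, 300, Infinity)); // 4700
// Selector finishes in 200 ms, inside the prep window: no added delay.
console.log(addedDelayMs(5000, 300, 200)); // 0
```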

The underlying error is an `AbortError: This operation was aborted` from the side query's own 5-second deadline:

```
"error": {
  "message": "This operation was aborted",
  "stack": "AbortError: This operation was aborted
    ...
    at AbortSignal.<anonymous> (file:///.../@qwen-code/qwen-code/cli.js:155551:61)
    ...
    at Timeout._onTimeout (node:internal/abort_controller:139:7)
    at listOnTimeout (node:internal/timers:594:17)"
}
```

The abort itself is swallowed by a `.catch` — there is no error in the UI — but the `await` still waits for the timer to fire. So the failure mode is silent and consistent: every turn looks slow, and the recall feature also silently degrades to "no memories surfaced" since the selector never produces a result.
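A minimal reproduction of this pattern, assuming the shape described above (early-fired promise, swallowed abort, late `await`). `slowSelector` and `runTurn` are hypothetical stand-ins and do not correspond to real qwen-code functions; the deadline is a parameter so the sketch can be run with a short value.

```typescript
// Hypothetical stand-in for the recall side query: it never produces a result
// and rejects with an AbortError once the signal fires.
function slowSelector(signal: AbortSignal): Promise<string[]> {
  return new Promise<string[]>((_resolve, reject) => {
    signal.addEventListener(
      "abort",
      () => {
        const err = new Error("This operation was aborted");
        err.name = "AbortError";
        reject(err);
      },
      { once: true },
    );
  });
}

async function runTurn(
  deadlineMs: number,
  prepMs: number,
): Promise<{ memories: string[]; waitedMs: number }> {
  const started = Date.now();
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), deadlineMs);

  // Fired early; the .catch swallows the abort, so nothing surfaces in the UI.
  const recall = slowSelector(controller.signal).catch(() => [] as string[]);

  // Concurrent prep work (history compaction, IDE context, ...), simulated.
  await new Promise((resolve) => setTimeout(resolve, prepMs));

  // Still blocks until the deadline timer fires, even though the error is hidden.
  const memories = await recall;
  clearTimeout(timer);
  return { memories, waitedMs: Date.now() - started };
}
```

Calling `runTurn(5000, 300)` resolves with an empty `memories` array only after roughly the full deadline has elapsed, which matches the silent, consistent slowdown described above.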

### What did you expect to happen?

The recall selector should complete within its budget on a typical setup so that the main agent is not delayed by the recall path on every turn.

### Client information

<details>
<summary>Client Information</summary>

```console
$ qwen /about
# will provide on request
```

</details>

### Login information

API Key (OpenAI-compatible, DeepSeek). Main model: `deepseek-v4-pro`.

### Anything else we need to know?

**Root of the problem**

The recall selector currently runs against the main session model with a 5-second deadline. On a reasoning-heavy main model like `deepseek-v4-pro`, that budget is not realistic — thinking tokens alone can exceed it before any structured output is produced — so the deadline fires every turn. The user pays the latency penalty regardless of whether the recall feature would have helped on that turn.

**Proposed solution: route this side query through `fastModel`**

The recall selector is a small, schema-constrained, latency-sensitive classification task — exactly the shape of work that `fastModel` was introduced for. Other background tasks (auto session titles, tool-use summaries, kebab-case rename) already use `fastModel`; the recall selector appears to have been overlooked.

The proposed change is to have the selector prefer `fastModel` when the user has configured one (via `/model --fast <id>` or `settings.json`), and fall back to the main model when it is unset. Users who already have a fast model configured see the per-turn latency penalty disappear immediately and get reliable recall results. Users who have not configured one keep today's behavior with no regression, and can opt in by setting `fastModel`.
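The fallback rule could look roughly like the following. This is a sketch under assumptions: the `ModelSettings` shape, the `model` field name, and `pickRecallModel` are hypothetical and not taken from the qwen-code source, and `some-fast-model` is a placeholder id.

```typescript
// Hypothetical settings shape; only `fastModel` is a name used in this issue.
interface ModelSettings {
  model: string; // main session model
  fastModel?: string; // optional fast model, e.g. set via `/model --fast <id>`
}

// Prefer the fast model for the recall selector; fall back to the main model
// when no fast model is configured (unset or blank).
function pickRecallModel(settings: ModelSettings): string {
  const fast = settings.fastModel?.trim();
  return fast ? fast : settings.model;
}

console.log(pickRecallModel({ model: "deepseek-v4-pro", fastModel: "some-fast-model" })); // some-fast-model
console.log(pickRecallModel({ model: "deepseek-v4-pro" })); // deepseek-v4-pro
```

Treating a blank `fastModel` the same as an unset one keeps the no-regression guarantee for users with an empty value in their settings.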

This is preferred over simply raising the 5-second deadline: the heavy main model is the wrong tool for this workload, and a longer deadline would only deepen the per-turn latency penalty when the call eventually succeeds.
