perf(memory): prewarm explicit local embeddings on gateway startup#70596
perf(memory): prewarm explicit local embeddings on gateway startup#70596taosiyuan163 wants to merge 2 commits into
Conversation
Greptile SummaryThis PR extends Confidence Score: 5/5Safe to merge — change is narrowly scoped, error paths are guarded, and new tests cover the key decision branches. All remaining observations are P2 or lower. The implementation logic is correct, error handling is complete (try/catch wrapping the async probe plus probe.ok check), and the startup path is non-blocking for all non-local configurations. No files require special attention. Reviews (1): Last reviewed commit: "perf(memory): prewarm builtin local embe..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d47f64291
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| } | ||
|
|
||
| try { | ||
| const probe = await manager.probeEmbeddingAvailability(); |
There was a problem hiding this comment.
Avoid caching failed builtin prewarm initialization
Calling probeEmbeddingAvailability() here runs against the default (cached) builtin manager, so a transient provider-init failure during startup can permanently break memory search for that process. MemoryIndexManager.get(...) reuses the same default manager instance, and ensureProviderInitialized() keeps a rejected init promise on failure (it only clears after successful initialization), so once prewarm fails early (for example while local model assets are still unavailable), subsequent memory_search calls keep reusing a poisoned manager instead of retrying later when the provider is ready.
Useful? React with 👍 / 👎.
9d47f64 to
db2f97d
Compare
|
Pushed follow-up commit f325ad4 to relax the Codex missing-auth contract assertion so equivalent openai/gpt-5.5 and openai-codex/gpt-5.5 guidance both satisfy the shared test helper. Local verification:
The new PR head currently only has the pull_request_target labeler suite attached. If pull_request CI is awaiting approval / rerun for the updated fork head, please re-enable or rerun the CI workflow on this latest commit. |
f325ad4 to
8296e9c
Compare
|
Codex review: needs real behavior proof before merge. Reviewed June 9, 2026, 1:08 AM ET / 05:08 UTC. Summary PR surface: Source +79, Tests +134, Docs +1. Total +214 across 8 files. Reproducibility: Partly: source inspection shows current main skips builtin memory in both the post-attach scheduler and startup helper, so the central prewarm path is absent. I did not establish a live successful local-model timing reproduction against current main. Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review detailsBest possible solution: Land a rebased, memory-only change only after maintainers choose the startup policy and the PR proves successful local prewarm plus first-search reuse while preserving current worker isolation; otherwise keep builtin local memory lazy by default. Do we have a high-confidence way to reproduce the issue? Partly: source inspection shows current main skips builtin memory in both the post-attach scheduler and startup helper, so the central prewarm path is absent. I did not establish a live successful local-model timing reproduction against current main. Is this the best way to solve the issue? No: the implementation is a plausible mitigation, but not yet the best merge shape because it forces eager local startup work for existing explicit-local configs without successful real proof or a settled startup policy. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 03a8d18cd434. Label changesLabel justifications:
Evidence reviewedPR surface: Source +79, Tests +134, Docs +1. Total +214 across 8 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
8296e9c to
9869fa5
Compare
|
Rebased this PR onto current What changed:
Local verification:
|
9869fa5 to
a0ccc61
Compare
|
Pushed another update for the Clawsweeper scheduler finding. Additional fix:
Updated local verification:
|
|
Added one more real-output check for the post-attach scheduler wiring on the latest head: This is from the production build output after |
|
@clawsweeper re-review Latest head |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
CI is green, but ClawSweeper's latest blocker looks like an owner-level call rather than a test failure: the PR moves explicit builtin/local prewarm into Gateway startup, while #75657 argues the current in-process @steipete could you take a look and advise whether this branch should:
Happy to rework the PR in whichever direction matches the intended runtime boundary. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
Summary
memory_searchwith explicitmemorySearch.provider = "local"still pays model/context initialization on the first post-startup search because gateway startup only warmed QMD.startGatewayMemoryBackendto prewarm builtin local managers by callingprobeEmbeddingAvailability()after manager creation, and add gateway startup tests for explicit-local, skip, and failure logging paths.provider = "auto"or remote embedding providers, so startup avoids new remote network calls or auth/cost side effects.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
memorySearch.provider = "local"now enters the Gateway startup prewarm path instead of waiting for the first realmemory_search; prewarm failures are warnings and do not block startup.pnpm build, temporary OpenClaw state/workspace, Node invoking the builtdist/server-startup-memory-*.jsmodule.memory.backend=builtin,agents.defaults.memorySearch.provider=local, vector disabled, and a temporarymemory/MEMORY.mdfile.main, surfaced the real local-provider availability error as a warning, and completed without throwing or blocking startup.node-llama-cppinstalled on this machine; the real setup here verifies the startup/prewarm routing and graceful failure path.startGatewayMemoryBackendunless the resolved backend was QMD, so this same builtin/local startup path did not call the memory manager prewarm probe.Root Cause (if applicable)
purpose: "status"managers bypass the builtin cache, so a status-only probe would not have helped the first real search.Regression Test Plan (if applicable)
src/gateway/server-startup-memory.test.tsprobeEmbeddingAvailability()at startup, non-local builtin configs skip startup work, and probe failures degrade to warnings.User-visible / Behavior Changes
agents.*.memorySearch.provider = "local"now shift local embedding cold-start work into post-attach startup, so the first realmemory_searchrequest avoids paying that initialization cost.provider = "auto", or remote memory embedding providers.Diagram (if applicable)
Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation:Repro + Verification
Environment
pnpmmemory.backend=builtin,agents.defaults.memorySearch.provider=localSteps
memory_searchrequest after startup.Expected
Actual
Evidence
Human Verification (required)
pnpm test src/gateway/server-startup-memory.test.ts;pnpm buildReview Conversations
Compatibility / Migration
Yes/No) YesYes/No) NoYes/No) NoRisks and Mitigations
memory.backend=builtinplus explicitprovider=local, after post-attach startup, and failures degrade to warnings without blocking startup.