Skip to content

perf(agents): reuse manifest metadata during model resolution#86552

Merged
steipete merged 1 commit into
openclaw:mainfrom
TimToxopeus:alyana/resolve-default-model-manifest-normalization
May 26, 2026
Merged

perf(agents): reuse manifest metadata during model resolution#86552
steipete merged 1 commit into
openclaw:mainfrom
TimToxopeus:alyana/resolve-default-model-manifest-normalization

Conversation

@TimToxopeus

@TimToxopeus TimToxopeus commented May 25, 2026

Copy link
Copy Markdown
Contributor

Summary

  • resolve manifest plugin metadata once while resolving configured/default model refs
  • pass the resolved manifestPlugins through alias index construction and nested model normalization instead of reloading manifest metadata repeatedly
  • add a regression test that keeps openrouter:auto manifest normalization working while proving metadata is only loaded once

Performance gained: resolve_default_model time went down from 3.80s to 0.15s

AI assistance

This PR was AI-assisted. Alyana/Codex helped identify the resolver hot path, write the patch, add the regression test, and run local validation.

Understanding

The resolver uses manifest plugin metadata to normalize provider/model refs, including aliases such as openrouter:auto and configured shorthand aliases. The issue was not that normalization was wrong, but that this codepath repeatedly reloaded the same manifest metadata while resolving aliases. This patch resolves the manifest plugin list once, preserves the default-manifest fallback behavior, and threads the resolved metadata through the nested resolver calls.

  • Observed result after fix: The resolver hot path dropped from seconds to roughly 156ms on the real Discord/session path while preserving manifest normalization for openrouter:auto and configured shorthand aliases.
  • What was not tested: I did not paste raw private Discord/session logs into this public PR. Broad full-suite failures remain in unrelated ACP env, CLI update/option-collision, and registry-backed channel contract shards.

Real behavior proof

  • Behavior or issue addressed: Real OpenClaw agent-turn model resolution was spending multiple seconds repeatedly loading manifest metadata while resolving the default model and alias index.
  • Real environment tested: Local Lumina/OpenClaw setup serving the real Discord-backed agent session for #alyana, using the same runtime path that triggered the profiling trace.
  • Exact steps or command run after this patch: Sent a real Discord request through the local OpenClaw session path and inspected the runtime resolver profiling output for the resulting agent turn after applying this patch.
  • Evidence after fix: Redacted runtime log / live output excerpt from the real OpenClaw setup after the patch:
before patch, real agent turn profiling:
resolve_default_model: 3.818s
  resolve_for_agent: 1.900s
  build_alias_index: 1.918s

after patch, same real session path:
resolve_default_model: 156ms
  resolve_for_agent: 80ms
  build_alias_index: 76ms
  • Observed result after fix: The resolver hot path dropped from 3.818s to 156ms on the same real Discord/session path while preserving manifest normalization for openrouter:auto and configured shorthand aliases.
  • What was not tested: Raw private Discord/session logs are not pasted into this public PR. Broad full-suite failures remain in unrelated ACP env, CLI update/option-collision, and registry-backed channel contract shards; no resolver/model-selection failures were observed.

Session / prompt context

The issue was found from local OpenClaw runtime profiling during real Discord request handling. I am not pasting raw private Discord/session logs into the PR body, but the sanitized trace above captures the measured before/after resolver timings from the real setup.

Validation

  • pnpm vitest run src/agents/model-selection-manifest-workspace.test.ts
  • pnpm vitest run src/agents/model-selection-resolve.test.ts src/agents/model-selection.test.ts src/gateway/session-utils.test.ts
  • pnpm build && pnpm check
  • pnpm test attempted; broad suite still has unrelated failures in ACP env, CLI update/option-collision, and registry-backed channel contract shards, with no resolver/model-selection failures observed
  • codex review --base origin/main attempted locally, but could not complete because local Codex/OpenAI auth is missing (401 Unauthorized)

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 7:25 PM ET / 23:25 UTC.

Summary
The PR threads a lazy manifest-plugin metadata context through agent model/default resolution, avoids manifest loads for empty or non-alias alias maps, updates QA/channel and plugin contract test fixtures, and adds resolver regression coverage.

PR surface: Source +174, Tests +313. Total +487 across 13 files.

Reproducibility: yes. The source path on current main still eagerly builds the alias index, and the PR plus linked issue provide redacted real-session timings; this review did not independently run the live Discord/WhatsApp path.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Rebase onto current main so CI and maintainers can review the actual merge result.

Risk before merge

  • The PR changes model/provider resolution and manifest normalization timing; existing configured aliases, provider-qualified defaults, and auth-profile suffix behavior need maintainer compatibility review even though focused regressions were added.
  • The branch is currently dirty against main, so the reviewed diff needs a rebase and CI on the exact merge result before landing.

Maintainer options:

  1. Rebase and rerun resolver gates (recommended)
    Resolve the dirty merge state, then rerun the focused resolver/model-selection tests and CI on the exact rebased head before merge.
  2. Maintainer-accept provider-routing risk
    A maintainer can accept the changed manifest-load timing after confirming configured aliases, provider-qualified defaults, and auth-profile suffix routing remain compatible.

Next step before merge
No automated repair is queued because the remaining work is rebase, CI on the rebased head, and maintainer provider-routing review rather than a narrow code defect.

Security
Cleared: No concrete security or supply-chain regression was found; the diff stays in resolver/channel/test code and does not alter secrets, dependencies, workflows, or release plumbing.

Review details

Best possible solution:

Rebase onto current main, keep the lazy manifest reuse only where provider/model normalization semantics remain covered, and land this as the narrow hot-path fix while the linked broader eager-alias issue remains tracked separately.

Do we have a high-confidence way to reproduce the issue?

Yes. The source path on current main still eagerly builds the alias index, and the PR plus linked issue provide redacted real-session timings; this review did not independently run the live Discord/WhatsApp path.

Is this the best way to solve the issue?

Yes, with maintainer review. Reusing a resolver-scoped manifest context matches the repository hot-path guidance, but the dirty branch and provider-routing compatibility surface make rebase and owner review the right next step.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against bef0ba8f5a7e.

Label changes

Label justifications:

  • P2: The PR addresses a real agent model-resolution performance problem with limited code surface but still needs normal maintainer review and rebase.
  • merge-risk: 🚨 compatibility: Model/default resolution semantics are compatibility-sensitive for existing configured models and aliases.
  • merge-risk: 🚨 auth-provider: Changing provider inference and manifest normalization timing can affect provider/model selection and auth-profile routing.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes redacted after-fix runtime log output from a real Discord-backed OpenClaw session with before/after resolver timings.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes redacted after-fix runtime log output from a real Discord-backed OpenClaw session with before/after resolver timings.
Evidence reviewed

PR surface:

Source +174, Tests +313. Total +487 across 13 files.

View PR surface stats
Area Files Added Removed Net
Source 4 218 44 +174
Tests 9 318 5 +313
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 13 536 49 +487

What I checked:

  • Current main still has the eager resolver path: On current main, resolveConfiguredModelRef builds the full alias index before checking whether the configured default is already a direct model ref, and buildModelAliasIndex parses each configured model key before checking for a non-empty alias. (src/agents/model-selection-shared.ts:575, bef0ba8f5a7e)
  • PR adds lazy manifest context: The PR head adds a resolver-scoped manifest plugin context, returns an empty alias index before manifest work when no aliases exist, and only resolves manifest metadata when alias/direct normalization actually needs it. (src/agents/model-selection-shared.ts:46, e5e0e0ba6042)
  • PR preserves alias and manifest behavior with tests: The PR adds tests for empty, wildcard-only, and alias-less configured models avoiding manifest loads, while preserving manifest normalization and slash-form alias precedence including auth-profile suffix handling. (src/agents/model-selection-manifest-workspace.test.ts:131, e5e0e0ba6042)
  • Real behavior proof is supplied: The PR body reports redacted real Discord/session resolver timings dropping from 3.818s to 156ms after the patch, and follow-up comments report targeted resolver tests, build, and a replacement CI run passing after earlier review fixes. (e5e0e0ba6042)
  • Live PR state is not merge-ready: GitHub reports the PR head as mergeable=false, rebaseable=false, mergeable_state=dirty; maintainer assignees are present, so the exact merge result needs a rebase and maintainer gate rather than auto-close. (e5e0e0ba6042)
  • Related issue shows a broader remaining hot-path problem: The linked issue reports an 80-85s event-loop block from provider-qualified default resolution and says this PR overlaps repeated manifest loading, while additional eager alias-index construction work may remain.

Likely related people:

  • steipete: Recent GitHub history shows repeated merged work on model-selection manifest context and plugin metadata snapshot hot paths, plus current PR head stewardship and assignment. (role: recent area contributor; confidence: high; commits: f00a912c2560, 7ee5fe011b27, 431467405486; files: src/agents/model-selection-shared.ts, src/plugins/plugin-metadata-snapshot.ts, src/plugins/manifest-contract-eligibility.ts)
  • shakkernerd: Recent model-selection history includes manifest metadata reuse and workspace scoping commits authored by shakkernerd, directly adjacent to the behavior this PR changes. (role: prior feature contributor; confidence: medium; commits: 20704ffab728, bfceb0d7f9a8, f5904392e9a2; files: src/agents/model-selection-shared.ts)
  • vincentkoc: Recent plugin metadata snapshot freshness work and live assignment make vincentkoc a likely reviewer for the plugin-metadata compatibility side of this PR. (role: adjacent owner; confidence: medium; commits: bf64de91918b; files: src/plugins/plugin-metadata-snapshot.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Gilded Review Wisp

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: sparkles near resolved comments.
Image traits: location review cove; accessory CI status badge; palette amber, ink, and glacier blue; mood focused; pose guarding a tiny green check; shell starlit enamel shell; lighting warm desk-lamp glow; background gentle dashboard dots.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Gilded Review Wisp in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@TimToxopeus

Copy link
Copy Markdown
Contributor Author

Addressed ClawSweeper's P2 on empty alias maps. buildModelAliasIndex now returns an empty alias index before resolving manifest metadata when there are no configured model entries.

Added regression coverage asserting empty alias maps do not call loadManifestMetadataSnapshot, while the configured alias path still reuses one manifest lookup.

Local verification after the fix:

  • pnpm vitest run src/agents/model-selection-manifest-workspace.test.ts -> 2 files / 10 tests passed
  • pnpm vitest run src/agents/model-selection-resolve.test.ts src/agents/model-selection.test.ts src/gateway/session-utils.test.ts -> 6 files / 484 tests passed
  • git diff --check -> passed

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@TimToxopeus

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

Addressed the remaining P2 by filtering alias candidates before resolving manifest metadata. buildModelAliasIndex now returns before manifest/plugin snapshot work when configured models are empty, wildcard-only, or contain no non-empty aliases.

Added regressions for:

  • wildcard-only configured models (anthropic/*)
  • configured model entries without aliases

Local verification after the fix:

  • pnpm vitest run src/agents/model-selection-manifest-workspace.test.ts -> 2 files / 14 tests passed
  • pnpm vitest run src/agents/model-selection-resolve.test.ts src/agents/model-selection.test.ts src/gateway/session-utils.test.ts -> 6 files / 484 tests passed
  • git diff --check -> passed

@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. and removed proof: supplied External PR includes structured after-fix real behavior proof. proof: sufficient ClawSweeper judged the real behavior proof convincing. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 25, 2026
@clawsweeper clawsweeper Bot added status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. and removed merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. labels May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@TimToxopeus

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

Addressed the CI regression after the narrowed lazy manifest lookup change. resolveConfiguredModelRef now preserves manifest-backed normalization for configured static provider models, so the configured default case normalizes llama-fast to nvidia/llama-fast again instead of leaving it bare.

Latest fix:

  • ce332c7b5ffix(agents): preserve manifest normalization for configured providers

Local verification after the fix:

  • Targeted regression: src/agents/model-selection.test.ts -t "normalizes bare configured default model strings with manifest policies" -> passed
  • src/agents/model-selection.test.ts -> 101 tests passed
  • src/agents/model-selection-manifest-workspace.test.ts -> 8 tests passed
  • src/agents/model-catalog-visibility.test.ts -> 3 tests passed
  • CI's previously noisy background-abort target -> passed locally
  • pnpm build -> passed

CI follow-up:

  • Replacement run 26414889525 is green, including checks-node-agentic-agents and build-artifacts.

@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@steipete steipete self-assigned this May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@vincentkoc vincentkoc self-assigned this May 25, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@steipete steipete force-pushed the alyana/resolve-default-model-manifest-normalization branch from 5a112e2 to e5e0e0b Compare May 25, 2026 23:09
@openclaw-barnacle openclaw-barnacle Bot added channel: qa-channel Channel integration: qa-channel size: L and removed size: M labels May 25, 2026
@clawsweeper clawsweeper Bot added the merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. label May 25, 2026
@steipete steipete force-pushed the alyana/resolve-default-model-manifest-normalization branch from e5e0e0b to 5ff9501 Compare May 25, 2026 23:53
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@vincentkoc vincentkoc force-pushed the alyana/resolve-default-model-manifest-normalization branch from 5ff9501 to 1308bd7 Compare May 26, 2026 00:01
Co-authored-by: Alyana <alyana@lumina.local>
@steipete steipete force-pushed the alyana/resolve-default-model-manifest-normalization branch from 1308bd7 to 59f3546 Compare May 26, 2026 00:02
@steipete

Copy link
Copy Markdown
Contributor

Rewrote this onto current main as a single focused commit and rechecked the model-resolution/cache behavior.

Behavior addressed: avoid repeated model-manifest metadata loading in configured model resolution while preserving alias, auth-profile, OpenRouter compat, and plugin-owned normalization behavior.
Real environment tested: local macOS checkout on Node 24.15.0 plus GitHub PR CI on head 59f35465cc391b4f8b206a8b9ba964d83840b908.
Exact steps or command run after this patch:

  • fnm exec --using v24.15.0 pnpm test src/agents/model-selection.plugin-runtime.test.ts src/agents/model-selection-manifest-workspace.test.ts src/agents/model-selection.test.ts src/agents/model-selection-resolve.test.ts -- --reporter=verbose
  • fnm exec --using v24.15.0 pnpm test src/plugins/manifest-model-id-normalization.test.ts src/plugins/capability-provider-runtime.test.ts src/plugins/web-search-providers.runtime.test.ts src/plugins/web-fetch-providers.runtime.test.ts -- --reporter=verbose
  • fnm exec --using v24.15.0 env -u OPENCLAW_TESTBOX pnpm check:changed
  • .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
  • Node CPU profiles with node --import tsx --cpu-prof for static and OpenRouter model-selection paths.
    Evidence after fix: focused model suite passed 117 tests; focused plugin suite passed 67 tests; check:changed passed; autoreview reported no accepted/actionable findings; GitHub CI run 26424771500 passed, with CodeQL/OpenGrep/Socket/real-behavior checks green or skipped as expected.
    Observed result after fix: static configured-model resolution stayed on the no-manifest fast path at about 0.0038 ms/run; OpenRouter compat path measured about 4.82 ms/run cold and about 2.05 ms/run with current metadata snapshot.
    What was not tested: no live provider call was made; this is resolver/cache behavior covered by unit, contract, build/check, and PR CI proof.

@steipete steipete merged commit ea2496b into openclaw:main May 26, 2026
98 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling channel: qa-channel Channel integration: qa-channel merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: L status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants