Skip to content

fix(agents): avoid eager alias normalization for default models#86671

Merged
steipete merged 1 commit into
openclaw:mainfrom
TurboTheTurtle:fix/provider-default-alias-fast-path-86635
May 27, 2026
Merged

fix(agents): avoid eager alias normalization for default models#86671
steipete merged 1 commit into
openclaw:mainfrom
TurboTheTurtle:fix/provider-default-alias-fast-path-86635

Conversation

@TurboTheTurtle

@TurboTheTurtle TurboTheTurtle commented May 25, 2026

Copy link
Copy Markdown
Contributor

Summary

  • avoid building a full model alias index when resolving a configured default model
  • skip model key parsing for configured model entries that do not define an alias
  • keep slash-form configured aliases working while making provider-qualified defaults resolve through the direct parse path

Real behavior proof

Behavior addressed: provider-qualified default model resolution no longer walks and normalizes every configured model just to resolve openai/gpt-5.5 on the inbound reply hot path.
Real environment tested: local OpenClaw source checkout on macOS at current PR head c8506316c7c05748c3389b5a4c4c47344579519b, using Node v24.15.0 with tsx to execute the production model-selection source directly.
Exact steps or command run after this patch: node --import tsx - <<'EOF' executed resolveConfiguredModelRef and buildModelAliasIndex from ./src/agents/model-selection.ts against a synthetic config with 97 aliasless configured models plus two configured aliases, then asserted that the provider-qualified default resolved to openai/gpt-5.5, the alias index contained only the two real aliases, and the slash-form alias preserved the routed model path. The proof data uses synthetic model names only.
Evidence after fix: Copied console output from the real command:

{
  "head": "c8506316c7c05748c3389b5a4c4c47344579519b",
  "scenario": "provider-qualified default model with many aliasless configured models",
  "configuredModelCount": 99,
  "resolved": {
    "provider": "openai",
    "model": "gpt-5.5"
  },
  "aliasCount": 2,
  "miniAlias": {
    "provider": "openai",
    "model": "gpt-5.5-mini"
  },
  "slashAlias": {
    "provider": "openrouter",
    "model": "xiaomi/mimo-v2-pro-mit"
  },
  "elapsedMs": 92.697
}
PASS current-head provider-qualified default resolved without indexing aliasless configured models

Observed result after fix: resolveConfiguredModelRef resolved the provider-qualified default directly to openai/gpt-5.5; buildModelAliasIndex ignored the 97 aliasless configured model entries and returned only the two configured aliases; the routed slash-form alias still resolved to openrouter/xiaomi/mimo-v2-pro-mit.
What was not tested: a live WhatsApp/gateway workload with the reporter's exact provider plugin normalization latency; the local proof exercises the production source hot path and regression shape from #86635 without private user data.

Verification

  • node --import tsx - <<'EOF' current-head production model-selection proof above
  • node scripts/run-vitest.mjs src/agents/model-selection.test.ts
  • node scripts/run-vitest.mjs src/agents/tools/message-tool.test.ts --testNamePattern 'uses a non-webchat session key'
  • pnpm exec oxfmt --check --threads=1 src/agents/model-selection.test.ts
  • git log --format='%h %an <%ae> %s' upstream/main..HEAD

Fixes #86635

Co-authored-by: Andy Ye 35905412+TurboTheTurtle@users.noreply.github.com

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 11:02 PM ET / 03:02 UTC.

Summary
This PR adds model-selection regression tests that mock provider model normalization and assert aliasless configured models are not normalized while provider-qualified defaults still resolve directly.

PR surface: Tests +61. Total +61 across 1 file.

Reproducibility: yes. the linked bug gives a concrete source path and timing evidence, and current source inspection shows the direct-resolution fix path. I did not run a failing current-main repro because main already contains the lazy/direct-resolution implementation.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • none

Next step before merge
No repair lane is needed; the patch has no actionable code defect and the remaining action is normal PR review and CI handling.

Security
Cleared: The diff is test-only and does not touch dependencies, workflows, credentials, package resolution, or runtime security boundaries.

Review details

Best possible solution:

Land the focused regression tests after normal CI and maintainer review, then close the linked bug when the implementation plus coverage are on main.

Do we have a high-confidence way to reproduce the issue?

Yes: the linked bug gives a concrete source path and timing evidence, and current source inspection shows the direct-resolution fix path. I did not run a failing current-main repro because main already contains the lazy/direct-resolution implementation.

Is this the best way to solve the issue?

Yes: as a test-only follow-up, the PR checks the hot-path invariants in the existing model-selection unit file without adding new runtime behavior or config surface.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c9d0464ed116.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes current-head terminal proof from executing the production model-selection source against a synthetic catalog and observing the expected direct resolution and alias filtering.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes current-head terminal proof from executing the production model-selection source against a synthetic catalog and observing the expected direct resolution and alias filtering.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P1: The PR is tied to a high-impact agent/gateway event-loop starvation bug that can delay inbound replies and channel health checks for real users.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes current-head terminal proof from executing the production model-selection source against a synthetic catalog and observing the expected direct resolution and alias filtering.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes current-head terminal proof from executing the production model-selection source against a synthetic catalog and observing the expected direct resolution and alias filtering.
Evidence reviewed

PR surface:

Tests +61. Total +61 across 1 file.

View PR surface stats
Area Files Added Removed Net
Source 0 0 0 0
Tests 1 62 1 +61
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 1 62 1 +61

What I checked:

  • PR diff adds call-count coverage for provider normalization: The PR head adds a hoisted provider-normalization mock and asserts buildModelAliasIndex only calls runtime normalization for the configured alias entry, not the 25 aliasless entries. (src/agents/model-selection.test.ts:100, c8506316c7c0)
  • PR diff covers provider-qualified default resolution: The PR head adds a regression test where primary model openai/gpt-5.5 resolves while 25 aliasless configured models do not trigger per-entry provider runtime normalization. (src/agents/model-selection.test.ts:1808, c8506316c7c0)
  • Current source skips aliasless configured-model rows: Current main builds alias candidates from entries with a non-empty alias, so aliasless defaults are filtered before model-key parsing and runtime normalization. (src/agents/model-selection-shared.ts:107, c9d0464ed116)
  • Current source uses direct provider-qualified parsing after alias scan: resolveConfiguredModelRef now scans configured aliases directly and then resolves slash-form provider/model strings without first constructing the full alias index. (src/agents/model-selection-shared.ts:663, c9d0464ed116)
  • Runtime call path matches the linked hot path: Inbound replies still call resolveDefaultModel before agent dispatch, and that helper resolves the default model then builds the alias index for later directive handling, matching the linked bug surface. (src/auto-reply/reply/get-reply.ts:252, c9d0464ed116)
  • Implementation provenance: Commit ea2496b introduced the lazy manifest context, alias-candidate filtering, and direct default resolution behavior that this PR tests. (src/agents/model-selection-shared.ts:488, ea2496b00c5c)

Likely related people:

  • Deepflame: git show and git blame tie the current lazy manifest/alias-resolution implementation in model-selection-shared.ts to ea2496b. (role: recent area contributor; confidence: high; commits: ea2496b00c5c; files: src/agents/model-selection-shared.ts, src/agents/model-selection.test.ts)
  • @vincentkoc: git blame and recent history show 84929e4 carried the surrounding model-selection shared/test baseline immediately before the lazy-resolution change. (role: recent adjacent contributor; confidence: medium; commits: 84929e426506; files: src/agents/model-selection-shared.ts, src/agents/model-selection.ts, src/agents/model-selection.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Neon Patch Peep

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: polishes edge cases.
Image traits: location flaky test forest; accessory miniature diff map; palette pearl, teal, and neon green; mood mischievous; pose sitting proudly on a smooth stone; shell matte ceramic shell; lighting subtle sparkle highlights; background miniature CI buoys.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Neon Patch Peep in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 26, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the fix/provider-default-alias-fast-path-86635 branch 2 times, most recently from 7861208 to e3802f4 Compare May 26, 2026 00:21
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 26, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the fix/provider-default-alias-fast-path-86635 branch from e3802f4 to c850631 Compare May 26, 2026 00:43
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 26, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
@TurboTheTurtle

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 26, 2026
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 26, 2026
@steipete steipete self-assigned this May 27, 2026
@steipete

Copy link
Copy Markdown
Contributor

Pre-merge verification for #86671:

Behavior addressed: regression coverage for provider-qualified default model resolution avoiding eager alias work across aliasless configured models.
Real environment tested: local OpenClaw checkout on main after syncing origin/main, plus PR head CI.
Exact steps or command run after this patch:

  • node scripts/run-vitest.mjs src/agents/model-selection.test.ts
  • node --import tsx -e "const {resolveConfiguredModelRef, buildModelAliasIndex}=await import('./src/agents/model-selection.ts'); const models=Object.fromEntries(Array.from({length:97},(_,i)=>['anthropic/claude-extra-'+i,{}])); const cfg={agents:{defaults:{model:{primary:'openai/gpt-5.5'},models:{...models,'openai/gpt-5.5-mini':{alias:'mini'},'openrouter/xiaomi/mimo-v2-pro-mit':{alias:'router/xiaomi'}}}}}; const resolved=resolveConfiguredModelRef({cfg,defaultProvider:'anthropic',defaultModel:'claude-sonnet-4-6'}); const idx=buildModelAliasIndex({cfg,defaultProvider:'anthropic'}); console.log(JSON.stringify({resolved,aliasCount:idx.byAlias.size,mini:idx.byAlias.get('mini')?.ref,slashAlias:idx.byAlias.get('router/xiaomi')?.ref},null,2)); if(resolved.provider!=='openai'||resolved.model!=='gpt-5.5'||idx.byAlias.size!==2) process.exit(1);"
  • git diff --check origin/main...refs/remotes/pull/86671 -- src/agents/model-selection.test.ts
    Evidence after fix: Vitest passed 206 tests across 2 files; inline production-path proof resolved openai/gpt-5.5 and kept only 2 aliases with 97 aliasless configured models; diff whitespace check was clean.
    Observed result after fix: the current production path already resolves the provider-qualified default directly and the PR adds focused regression coverage for that behavior.
    CI: gh pr checks 86671 shows relevant required lanes passing, including CI, check-lint, check-prod-types, check-test-types, checks-node-agentic-agents, security-fast, OpenGrep, CodeQL/security selected lanes, and Real behavior proof. Routine auto-response/label/docs/non-selected lanes were ignored or skipped.
    What was not tested: no live gateway/WhatsApp workload; this PR is test-only and current production behavior was proven locally on the model-selection path.

@steipete steipete merged commit 2e3b4b5 into openclaw:main May 27, 2026
150 of 161 checks passed
@TurboTheTurtle TurboTheTurtle deleted the fix/provider-default-alias-fast-path-86635 branch May 27, 2026 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Provider-qualified default model resolution eagerly builds alias index and can block gateway event loop ~80s

2 participants