fix(agents): preserve runtime tools in lean mode#88381

Merged
steipete merged 1 commit into main from codex/local-model-lean-required-tools on May 31, 2026

Conversation

@vincentkoc (Member) commented May 30, 2026

Summary

  • keeps localModelLean from stripping runtime-required tools when replies must go through the message tool
  • preserves explicit runtime allowlists plus forceMessageTool and sourceReplyDeliveryMode: "message_tool_only" through both tool construction and embedded-run schema projection
  • adds focused coverage for forced message tools, message-tool-only replies, grouping, and wildcard preservation without disabling lean filtering
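
As a rough illustration of the second bullet, the preserved tool names could be resolved once from the runtime options. This is a minimal sketch, not the PR's implementation: `LeanRuntimeOptions`, `runtimeToolAllowlist`, and `resolvePreservedToolNames` are hypothetical names, and only `forceMessageTool`, `sourceReplyDeliveryMode`, `"message_tool_only"`, and the `message` tool name come from the PR text. Wildcard and group handling are left out because their semantics are not described here.

```typescript
// Hypothetical option shape; only forceMessageTool and sourceReplyDeliveryMode
// are named in the PR description.
interface LeanRuntimeOptions {
  forceMessageTool?: boolean;
  sourceReplyDeliveryMode?: string;
  // Assumed field: tool names the runtime explicitly allows.
  runtimeToolAllowlist?: string[];
}

// Collect every tool name that lean filtering should leave in place.
function resolvePreservedToolNames(options: LeanRuntimeOptions): Set<string> {
  const preserved = new Set<string>();
  for (const name of options.runtimeToolAllowlist ?? []) {
    const trimmed = name.trim();
    if (trimmed.length > 0) {
      preserved.add(trimmed);
    }
  }
  // Replies that must go through the message tool need `message` to survive.
  if (
    options.forceMessageTool === true ||
    options.sourceReplyDeliveryMode === "message_tool_only"
  ) {
    preserved.add("message");
  }
  return preserved;
}

const example = resolvePreservedToolNames({
  runtimeToolAllowlist: ["cron"],
  forceMessageTool: true,
});
console.log(Array.from(example).join(", ")); // prints "cron, message"
```

Resolving the set in one place means every later filter can take the same value instead of re-deriving the runtime requirement.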

Verification

  • `node scripts/run-vitest.mjs src/agents/local-model-lean.test.ts src/agents/agent-tools.create-openclaw-coding-tools.test.ts src/agents/embedded-agent-runner/run/attempt.test.ts --reporter=dot` passed, 190 tests
  • `git diff --check origin/main...HEAD` passed
  • AWS Crabbox `pnpm check:changed` passed: provider aws, lease cbx_05e9c25d9687, slug crimson-hermit, run run_e0161fcff3cd, machine c7a.8xlarge, exit 0, leaseStopped=true

What was not tested

  • Live Ollama/provider run. Local Ollama API was unavailable on this machine, so this is covered by focused runtime-tool tests plus the remote changed gate.

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) label on May 30, 2026
@vincentkoc vincentkoc self-assigned this on May 30, 2026
@openclaw-barnacle openclaw-barnacle Bot added the size: S and maintainer (Maintainer-authored PR) labels on May 30, 2026
@clawsweeper Bot (Contributor) commented May 30, 2026

Codex review: needs maintainer review before merge. Reviewed May 31, 2026, 9:49 AM ET / 13:49 UTC.

Summary
The PR updates local-model lean filtering so runtime-required or explicitly allowed tools, especially message, survive tool construction and embedded-run schema projection, with focused regression coverage.

PR surface: Source +57, Tests +138. Total +195 across 5 files.

Reproducibility: yes. Source inspection shows current main forces message for message_tool_only but then runs lean filtering that removes message without a preservation path.

Review metrics: 1 noteworthy metric.

  • Lean message-delivery gates: 2 runtime filter paths updated. Both construction-time and embedded schema-projection filtering must preserve message for source replies to reach users.
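
The two gates can be pictured with a minimal sketch, assuming a shared filter that both paths call with the same preserved-name set. The denied names `browser`, `cron`, and `message` are taken from this review; `filterLean`, `buildTools`, `projectSchemas`, and the `read` tool are hypothetical stand-ins rather than the repository's actual functions.

```typescript
interface ToolDef {
  name: string;
}

interface ToolSchema {
  name: string;
  parameters: Record<string, unknown>;
}

// Tools this review says lean mode removes.
const LEAN_DENIED = new Set<string>(["browser", "cron", "message"]);

// Shared filter: drop denied tools unless the runtime asked to preserve them.
function filterLean<T extends { name: string }>(
  tools: readonly T[],
  preserve: ReadonlySet<string>,
): T[] {
  return tools.filter(
    (tool) => !LEAN_DENIED.has(tool.name) || preserve.has(tool.name),
  );
}

// Gate 1 (hypothetical): tool construction.
function buildTools(all: readonly ToolDef[], preserve: ReadonlySet<string>): ToolDef[] {
  return filterLean(all, preserve);
}

// Gate 2 (hypothetical): schema projection for the embedded run.
function projectSchemas(
  tools: readonly ToolDef[],
  preserve: ReadonlySet<string>,
): ToolSchema[] {
  return filterLean(tools, preserve).map((tool) => ({
    name: tool.name,
    parameters: {},
  }));
}

const allTools: ToolDef[] = [
  { name: "read" },
  { name: "browser" },
  { name: "cron" },
  { name: "message" },
];
const preserve = new Set<string>(["message"]);

const built = buildTools(allTools, preserve);
const projected = projectSchemas(built, preserve);
console.log(projected.map((s) => s.name).join(", ")); // prints "read, message"

// If the second gate forgets the preserved set, message is lost again.
const dropped = projectSchemas(built, new Set<string>());
console.log(dropped.map((s) => s.name).join(", ")); // prints "read"
```

The last two lines show why updating only one gate would not be enough: a tool kept at construction time can still disappear at projection time.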

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🌊 off-meta tidepool
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
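
A small sketch of that rule, assuming the rank order given in the legend later in this comment and assuming that 🌊 off-meta tidepool ("rating does not apply") is skipped rather than counted as the weakest rank, which matches the ratings shown above. The function name `overallRating` is hypothetical.

```typescript
type Rank =
  | "challenger crab"
  | "diamond lobster"
  | "platinum hermit"
  | "gold shrimp"
  | "silver shellfish"
  | "unranked krab";

type Rating = Rank | "off-meta tidepool";

// Weakest to strongest, following the legend's ordering.
const RANK_ORDER: readonly Rank[] = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
];

// Overall readiness is the weaker of the two applicable ratings.
function overallRating(proof: Rating, patch: Rating): Rating {
  const applicable = [proof, patch].filter(
    (rating): rating is Rank => rating !== "off-meta tidepool",
  );
  if (applicable.length === 0) {
    return "off-meta tidepool";
  }
  return applicable.reduce((weakest, rating) =>
    RANK_ORDER.indexOf(rating) < RANK_ORDER.indexOf(weakest) ? rating : weakest,
  );
}

console.log(overallRating("off-meta tidepool", "diamond lobster")); // prints "diamond lobster"
console.log(overallRating("gold shrimp", "diamond lobster")); // prints "gold shrimp"
```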

Risk before merge

  • [P1] This PR changes a message-delivery gate: a bad merge could still suppress message_tool_only replies or expose message more broadly than intended in lean local-model runs.
  • [P1] The PR body says no live Ollama/provider run was performed, so final landing depends on maintainers accepting focused runtime-tool tests plus Crabbox changed-gate proof or running an additional live check.

Maintainer options:

  1. Land With Refreshed Merge Proof (recommended)
    Before landing, re-run the focused test command or changed gate on the final merge result so the lean-mode message_tool_only path remains covered after base drift.
  2. Accept Focused Coverage
    Maintainers can intentionally accept the unit and Crabbox proof if a live Ollama/provider run is not necessary for this internal tool-surface fix.
  3. Pause For Live Provider Proof
    If maintainers want runtime proof, pause until a live local-provider run demonstrates a message_tool_only reply with localModelLean enabled.

Next step before merge

  • [P2] The PR is maintainer-labeled and has no actionable automated repair; the remaining action is maintainer landing and final proof judgment.

Security
Cleared: The diff only changes agent tool-filtering code and tests; it does not add dependencies, workflows, install scripts, secret handling, or artifact execution paths.

Review details

Best possible solution:

Land the narrow preserve-through-filtering fix after maintainer review with the focused tests or changed gate refreshed against the final merge result.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main forces message for message_tool_only but then runs lean filtering that removes message without a preservation path.

Is this the best way to solve the issue?

Yes. Centralizing preserved tool names and threading them through both tool construction and embedded schema projection is a narrow fix for the observed gap without disabling lean filtering.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0d17623f0090.

Label changes:

  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🌊 off-meta tidepool and patch quality is 🦞 diamond lobster.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P2: This is a normal-priority bug fix for a limited local-model lean/message-tool delivery path, with focused coverage and no emergency signal.
  • merge-risk: 🚨 message-delivery: The diff changes whether the message tool survives lean filtering, which can affect visible reply delivery in message_tool_only runs.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🌊 off-meta tidepool and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: The contributor proof gate does not apply to this maintainer-authored PR; the body still records focused tests and a Crabbox changed-gate run.

Evidence reviewed

PR surface:

Source +57, Tests +138. Total +195 across 5 files.

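PR surface stats:

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 3 | 60 | 3 | +57 |
| Tests | 2 | 139 | 1 | +138 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 5 | 199 | 4 | +195 |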

What I checked:

  • Repository policy read: Root and scoped agent policies were read; the maintainer/protected-label and message-delivery review guidance applies to this PR. (AGENTS.md:11, 0d17623f0090)
  • Scoped agent guidance read: The agent scoped policy favors focused helper tests for this surface, which matches the patch's approach. (src/agents/AGENTS.md:1, 0d17623f0090)
  • Current main strips lean-denied tools unconditionally: Current filterLocalModelLeanTools removes browser, cron, and message whenever lean mode is enabled, with no preservation hook. (src/agents/local-model-lean.ts:6, 0d17623f0090)
  • Current embedded run forces message but later projects through lean filtering: Current main forces message for message_tool_only, then calls filterLocalModelLeanTools during schema projection without passing that runtime requirement through. (src/agents/embedded-agent-runner/run/attempt.ts:1091, 242eab9d20f7)
  • PR diff preserves runtime-required tools in both affected paths: The head diff adds a preserve-name resolver, threads it through createOpenClawCodingTools, and passes it into embedded-run schema projections. (src/agents/local-model-lean.ts:2, 9502f7489cb1)
  • Validation evidence: The PR body records the focused Vitest command passing 190 tests, git diff --check passing, and AWS Crabbox pnpm check:changed passing on run run_e0161fcff3cd. (9502f7489cb1)

Likely related people:

  • steipete: Git blame attributes the current local-model lean helper and nearby embedded-run filtering/projection code to Peter Steinberger, and recent commits also touch the embedded runner path. (role: recent area contributor; confidence: high; commits: 242eab9d20f7, 1e54e908e2e4, f24a13879095; files: src/agents/local-model-lean.ts, src/agents/agent-tools.ts, src/agents/embedded-agent-runner/run/attempt.ts)
  • yaoyi1222: Recent current-main work touched message-tool final-reply behavior, which is adjacent to the source-reply delivery mode protected by this PR. (role: adjacent source-reply contributor; confidence: medium; commits: 75e0053cf969; files: src/auto-reply/reply/agent-runner.ts, src/auto-reply/reply/private-message-tool-final.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.
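
The command rules above can be summarized as a small classifier. This is a hypothetical sketch of the permissions as described in the list, not ClawSweeper's actual parser: the `Commenter` shape, the action labels, and the matching logic are all assumptions. Maintainers are modeled as commenters who also have write access.

```typescript
interface Commenter {
  isAuthor: boolean;
  hasWriteAccess: boolean;
  isMaintainer: boolean;
}

type Action = "fresh-review" | "repair-or-merge" | "explain" | "stop" | "ignored";

interface Rule {
  phrase: string;
  action: Action;
  permitted: (commenter: Commenter) => boolean;
}

const authorOrWriter = (c: Commenter): boolean => c.isAuthor || c.hasWriteAccess;
const maintainerOnly = (c: Commenter): boolean => c.isMaintainer;

// Phrases come from the workflow description; grouping into actions is illustrative.
const RULES: Rule[] = [
  { phrase: "re-review", action: "fresh-review", permitted: authorOrWriter },
  { phrase: "re-run", action: "fresh-review", permitted: authorOrWriter },
  { phrase: "review", action: "fresh-review", permitted: maintainerOnly },
  { phrase: "autofix", action: "repair-or-merge", permitted: maintainerOnly },
  { phrase: "automerge", action: "repair-or-merge", permitted: maintainerOnly },
  { phrase: "fix ci", action: "repair-or-merge", permitted: maintainerOnly },
  { phrase: "address review", action: "repair-or-merge", permitted: maintainerOnly },
  { phrase: "explain", action: "explain", permitted: maintainerOnly },
  { phrase: "stop", action: "stop", permitted: maintainerOnly },
];

function classifyComment(body: string, commenter: Commenter): Action {
  const marker = "@clawsweeper ";
  const lower = body.toLowerCase();
  const at = lower.indexOf(marker);
  if (at < 0) {
    return "ignored";
  }
  const rest = lower.slice(at + marker.length).trim();
  for (const rule of RULES) {
    // Match the phrase as a whole command, not as a prefix of a longer word.
    if (rest === rule.phrase || rest.startsWith(rule.phrase + " ")) {
      return rule.permitted(commenter) ? rule.action : "ignored";
    }
  }
  return "ignored";
}

const author: Commenter = { isAuthor: true, hasWriteAccess: false, isMaintainer: false };
const maintainer: Commenter = { isAuthor: false, hasWriteAccess: true, isMaintainer: true };

console.log(classifyComment("@clawsweeper re-review", author)); // prints "fresh-review"
console.log(classifyComment("@clawsweeper autofix", author)); // prints "ignored"
console.log(classifyComment("@clawsweeper autofix", maintainer)); // prints "repair-or-merge"
```

Keeping fresh-review commands and repair commands in separate action groups mirrors the rule that a fresh review never starts repair, rebase, or automerge.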

@clawsweeper clawsweeper Bot added the labels rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.), status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.), and P2 (Normal backlog priority with limited blast radius.) on May 30, 2026
@vincentkoc vincentkoc force-pushed the codex/local-model-lean-required-tools branch 2 times, most recently from 63f0e20 to 33d998d Compare May 31, 2026 04:27
@clawsweeper clawsweeper Bot added the labels rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.), status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.), and merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.), and removed the labels rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.) and status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.) on May 31, 2026
@vincentkoc vincentkoc force-pushed the codex/local-model-lean-required-tools branch from 33d998d to 23f31e8 Compare May 31, 2026 13:19
@clawsweeper clawsweeper Bot added the labels rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.) and status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.), and removed the labels rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.) and status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.) on May 31, 2026
@vincentkoc vincentkoc force-pushed the codex/local-model-lean-required-tools branch from 23f31e8 to e4573b6 Compare May 31, 2026 13:28
@vincentkoc vincentkoc force-pushed the codex/local-model-lean-required-tools branch from e4573b6 to 9502f74 Compare May 31, 2026 13:37
@clawsweeper clawsweeper Bot added the label rating: 🦞 diamond lobster (Very strong PR readiness with only minor maintainer review expected.) and removed the label rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.) on May 31, 2026
@steipete (Contributor) commented

Maintainer verification before merge:

Behavior addressed: local-model lean filtering could remove runtime-required tools, including message, after the runtime had already decided that visible replies must use the message tool.
Real environment tested: local focused Vitest in this checkout on PR head 9502f7489cb1fed649ea3c78ed158a632abce1c4; GitHub CI on the same head.
Exact steps or command run after this patch: `node scripts/run-vitest.mjs src/agents/local-model-lean.test.ts src/agents/agent-tools.create-openclaw-coding-tools.test.ts src/agents/embedded-agent-runner/run/attempt.test.ts --reporter=dot`; `git diff --check origin/main...HEAD`; `/opt/homebrew/opt/gh/bin/gh pr checks 88381 --watch=false`.
Evidence after fix: focused Vitest passed 190 tests; diff check passed; PR checks have no fail/pending entries; Real behavior proof passed in CI.
Observed result after fix: lean mode still filters denied tools, but preserves runtime-allowed/forced message for message_tool_only reply paths and schema projection.
What was not tested: no live local-provider/Ollama run; local Ollama proof remains intentionally absent from the PR body.

@steipete steipete merged commit 4d135ae into main May 31, 2026
158 of 160 checks passed
@steipete steipete deleted the codex/local-model-lean-required-tools branch May 31, 2026 14:43
vincentkoc added a commit that referenced this pull request May 31, 2026
…n-rotation-current

* origin/main: (52 commits)
  fix(agents): prevent embedded runtime shadowing
  fix(outbound): route source replies through configured channels
  refactor(cron): split tool and doctor repair helpers
  perf: reduce tui refresh work
  feat: default exec shell snapshots
  fix(ui): keep chat usable during session loading
  fix(cron): guard flat atMs canonicalization
  refactor(cron): keep runtime on canonical sqlite rows
  fix(codex): restore bounded recovery continuity
  refactor: clean up ACP package metadata and helpers (#88659)
  fix(discord): ping mention-bearing final replies
  fix(telegram): preserve usage footer for tool-only replies
  fix(agents): avoid alias setup load for matching refs
  chore(ui): translate thinking default label
  fix(agents): preserve runtime tools in lean mode (#88381)
  fix(messages): use best-effort for implicit tool-only source replies (#84232)
  docs: raise bulk PR close threshold
  feat: add exec shell snapshot cache
  fix: use typed tui empty session defaults
  perf: speed up tui session refresh
  ...

# Conflicts:
#	src/tui/tui-command-handlers.test.ts
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 1, 2026
fix(agents): preserve runtime tools in lean mode

Keep runtime-required tools, especially `message`, available when local-model lean filtering is enabled. This preserves `forceMessageTool`, `message_tool_only` source replies, explicit runtime allowlists, and schema projection without disabling lean filtering for ordinary denied tools.

Proof: focused Vitest passed 190 tests; `git diff --check origin/main...HEAD` passed; PR CI had no failing or pending checks.
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Labels

  • agents: Agent runtime and tooling
  • maintainer: Maintainer-authored PR
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • P2: Normal backlog priority with limited blast radius.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • size: M
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

2 participants