[Feature] Tool set cleanup: rewrite descriptions, remove redundant tools, harden deletion guardrails #129

## What task are you trying to do?

We want PawWork's base tool surface to be smaller, more honest, and harder for models to misuse. This issue covers three concurrent strands:

1. **Description cleanup** (including over-push framing in `todowrite.txt`): remove fictional examples, delegation hype, lying behavioral claims, content that teaches wrong tool boundaries, and framing that pushes tool selection as performance rather than organization; fill teaching gaps that bias weaker models on common tool boundaries (e.g. line-deletion semantics in `edit.txt`).
2. **Redundant tool removal**: delete tools whose function fully overlaps another tool or whose default-path exposure is zero.
3. **Deletion guardrail hardening**: stop the bash description from teaching `rm` as a normal example, and strengthen the trash-vs-rm boundary.

The goal is to move PawWork toward Claude Code's tool count (~15) and away from Gemini CLI's (~23), without going as bare as Codex CLI's (~8), which is GPT-only and would break PawWork's open-to-all-models stance.

## What do you do today?

`task.txt` embeds two fictional example agents (`code-reviewer`, `greeting-responder`) that do not exist in the PawWork subagent list. It explicitly tells models to "Launch multiple agents concurrently whenever possible" and that "The agent's outputs should generally be trusted" — both push delegation harder than PawWork's product direction warrants.

`glob.txt` tells models to "speculatively perform multiple searches as a batch".

`skill.txt` says skills are "listed in the system prompt" in **two places** (L1 and L5); the actual mechanism is `registry.ts:283` (`describeSkill`) appending the list at the end of the skill tool description itself — both pointers are literally wrong.

`multiedit.txt` claims atomicity ("All edits must be valid for the operation to succeed - if any edit fails, none will be applied"), but `multiedit.ts:38-50` is a plain `for` loop calling `edit.execute()` with each successful write landing immediately and no rollback. The description lies about runtime behavior; users get half-edited files when later edits fail.
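
The gap between the atomicity claim and a plain sequential loop can be illustrated with a minimal in-memory sketch. This is not the code in `multiedit.ts`; `applyEdit`, `applySequentially`, and the `Map`-backed file store are hypothetical stand-ins, assumed only to mirror the "write each edit immediately, no rollback" shape described above.

```typescript
type Edit = { oldString: string; newString: string }

// Hypothetical stand-in for a single edit call: throws when oldString is absent,
// otherwise writes the result back immediately.
function applyEdit(files: Map<string, string>, path: string, edit: Edit): void {
  const content = files.get(path) ?? ""
  if (!content.includes(edit.oldString)) {
    throw new Error(`oldString not found: ${edit.oldString}`)
  }
  files.set(path, content.replace(edit.oldString, edit.newString))
}

// Plain loop with no snapshot and no rollback: edits that succeeded before a
// failure stay on "disk".
function applySequentially(files: Map<string, string>, path: string, edits: Edit[]): void {
  for (const edit of edits) {
    applyEdit(files, path, edit)
  }
}

const files = new Map([["a.txt", "one two three"]])
let failed = false
try {
  applySequentially(files, "a.txt", [
    { oldString: "one", newString: "1" },
    { oldString: "missing", newString: "x" }, // fails here
    { oldString: "three", newString: "3" }, // never reached
  ])
} catch {
  failed = true
}

// The first edit has landed even though the batch as a whole failed.
const halfEdited = files.get("a.txt")
```

In this sketch `halfEdited` ends up as `"1 two three"`: neither the original content nor the fully edited content, which is the half-edited state the description promises cannot happen.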

A separate finding from the spec audit: `multiedit` is **not currently registered in the builtin tool list in `registry.ts`**. Beyond its own two files (`packages/opencode/src/tool/multiedit.ts` and `packages/opencode/src/tool/multiedit.txt`), the tool survives only as string literals or test/spec mentions in 7 further paths, 9 in total: `packages/opencode/src/permission/index.ts:297` (`EDIT_TOOLS = ["edit", "write", "apply_patch", "multiedit"]`), `packages/opencode/src/config/agent.ts:111` (legacy migration union check), `packages/opencode/src/config/config.ts:733` (legacy migration union check), `packages/opencode/specs/effect-migration.md:269` (one-line spec mention), `packages/opencode/test/agent/agent.test.ts:493` (test name), `packages/opencode/test/config/config.test.ts:2006-2030` (the entire `migrates legacy multiedit tool to edit permission` test block, a multiedit-specific migration test), and `packages/opencode/test/permission/next.test.ts:390-401` (test name + array entry + assertion). So removing multiedit does NOT change the tool surface any model actually sees; it removes dead code, the lying description, three string-literal references in src, one spec mention, and three test references. **Per user decision: PawWork has no legacy users / no legacy config in the wild, so the entire multiedit migration code path is removed (not preserved as a no-op).**

`codesearch` is gated to `providerID === ProviderID.opencode || Flag.OPENCODE_ENABLE_EXA` (`registry.ts:319-321`). `WebSearchTool.id` sits in the same OR condition, so the gate covers both tools. PawWork users on the default Zen provider never see codesearch, but websearch goes through this same gate, and that gating must be preserved when codesearch is removed. The codesearch description still pushes "Use this tool for ANY question or task related to programming"; the `.txt` + `.ts` + dispatcher branch sit in the codebase for no PawWork value.

`bash.txt` L16 uses `rm "path with spaces/file.txt"` as the *quoting example*, then never warns against `rm` anywhere. `trash.txt` (4 lines) tries to redirect deletion through trash, but bash.txt's reverse-push is louder and models keep reaching for `rm` then failing.

Important context for the "then failing" half: the patterns `rm *`/`rmdir *`/`unlink *`/`find * -delete*`/`sudo *`/`dd *`/`mkfs*`/`chmod *`/`kill *` are all set to `deny` in PawWork's build-agent default permission ruleset at `packages/opencode/src/agent/agent.ts:99-110`, validated by `packages/opencode/test/permission/pawwork-defaults.test.ts`. So `rm` calls already hard-fail at the permission layer — this is an existing PawWork carveout over upstream, not something this issue introduces. Note: the deny ruleset covers POSIX deletion paths only; Windows deletion commands (`del`, `erase`, `rd`, `Remove-Item`) are NOT currently denied — tracked as a separate signal for follow-up after PR1 lands. The bash.txt + trash.txt edits in PR1 reduce the wasted first-attempt that ends in a deny error and stop bash.txt from actively suggesting `rm` as the canonical example; they are NOT the layer doing the actual blocking.
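
To make the POSIX-only coverage concrete, here is a minimal sketch that checks commands against the deny patterns listed above. The matcher is an assumption for illustration: it treats `*` as "any run of characters" and anchors the whole command, which may differ from how PawWork's permission layer actually matches. `globToRegExp` and `isDenied` are hypothetical names.

```typescript
// Deny patterns copied from the list above.
const denyPatterns = [
  "rm *", "rmdir *", "unlink *", "find * -delete*",
  "sudo *", "dd *", "mkfs*", "chmod *", "kill *",
]

// Assumed glob semantics: escape regex metacharacters, then let `*` match any
// run of characters. Not necessarily PawWork's real matcher.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$")
}

function isDenied(command: string): boolean {
  return denyPatterns.some((pattern) => globToRegExp(pattern).test(command))
}

// POSIX deletion paths are caught by the listed patterns.
const posixRm = isDenied("rm -rf build")
const posixFind = isDenied("find . -name '*.tmp' -delete")

// Windows deletion commands fall through, because no listed pattern covers them.
const windowsDel = isDenied("del notes.txt")
const windowsRemoveItem = isDenied("Remove-Item build -Recurse")
```

Under these assumed semantics, `posixRm` and `posixFind` are `true` while `windowsDel` and `windowsRemoveItem` are `false`, which is the follow-up gap noted above.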

`bash.txt` does not have a single "DO NOT" list section — current text is prose plus a bullet list at L31 (`Avoid using Bash with the find, grep, cat, head, tail, sed, awk, or echo commands...`). The rm guardrail will be added by extending that L31 bullet list, not by inserting a new "DO NOT" section.

`bash.txt` L86 (`NEVER use the TodoWrite or Task tools`) and L113 (`DO NOT use the TodoWrite or Task tools`) instruct models against TodoWrite/Task tool use during git/PR workflows — this contradicts PawWork's project-level encouragement of TodoWrite for multi-step work. Both lines sit *inside* the deferred 66-line git/PR workflow block; the surgical deletion of these two lines is an explicit exception to the otherwise-deferred status of that block.

`apply_patch.txt` L12 and L26 contain `*** Delete File: <path>` syntax which lets the model delete files directly through the patch envelope, bypassing the trash guardrail. The first audit pass missed this; apply_patch.txt is **not** clean and needs a guardrail line added (prose-level only — runtime dispatcher enforcement is out of scope, same upstream-foundation reasoning as keeping `apply_patch` itself).

`todowrite.txt` L1 frames the tool as helping the user "track progress, organize complex tasks, and demonstrate thoroughness to the user". L166 reads `When in doubt, use this tool. Being proactive with task management demonstrates attentiveness and ensures you complete all requirements successfully.` Both push using the tool as performance, not organization.

`edit.txt` is currently 10 lines of exact-replacement semantics but does not teach the line-deletion boundary case. Weaker models routinely delete a single line by passing the line content as `oldString` and an empty `newString` but forget the trailing newline character — leaving a stray blank line at the deleted position. PawWork's default audience includes weaker models (GLM-5.1, Kimi K2.6, Qwen Coder); for them this teaching gap matters more than for stronger models that have absorbed the convention from training data. This is not a "teach wrong" failure but a "fail to teach" failure on a known stumbling block.
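
The stray-blank-line failure is easy to reproduce with plain string replacement. The sketch below assumes the edit tool behaves like an exact first-match substring replace, which is what "exact-replacement semantics" suggests; it does not call the real tool.

```typescript
const before = "alpha\nbeta\ngamma\n"

// oldString without the trailing newline: the line content goes away but its
// newline stays, leaving an empty line where "beta" used to be.
const withoutNewline = before.replace("beta", "")

// oldString including the trailing newline: the whole line is removed cleanly.
const withNewline = before.replace("beta\n", "")
```

Here `withoutNewline` is `"alpha\n\ngamma\n"` (three lines, the middle one blank) and `withNewline` is `"alpha\ngamma\n"`.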

Deferred (still not in this issue): `bash.txt` ~64 lines of remaining git/PR workflow (after L86/L113 surgical removals); `todowrite.txt`'s 8-example demonstration block. Both are "compress for context budget" calls that need validation infrastructure to land safely.

## What would a good result look like?

`multiedit` and `codesearch` are removed (description + implementation + string-literal references + dispatcher gate). Existing descriptions stop teaching wrong behavior, lying about runtime semantics, or contradicting other PawWork project rules. The trash-vs-rm boundary becomes the obvious default rather than competing with the bash example. apply_patch's deletion syntax has a prose-level guardrail line. Effect on visible tool count for the default profile (Zen provider + non-GPT model + no LSP + no plan flag): zero net change at runtime — `codesearch` is already gated off for default Zen users, and `multiedit` was never registered. The cleanup matters by reducing dead code, lying descriptions, and incorrect pointers, not by altering what models see at runtime under default settings. Nothing requires new fixture infrastructure to land.

## Which audience does this matter to most?

Both

## Extra context — why this is a PawWork carveout, not an upstream feature request

opencode is shaped around dev-first defaults — its users routinely operate `commit`/`PR`/`gh CLI` as everyday vocabulary, and apply_patch + heavy delegation hype + Exa code search are net-positive there. PawWork's audience distribution is wider (covers technical and non-technical users; default tasks span documents, spreadsheets, code, research, life admin) and the default-task distribution is different. The same description text produces inconsistent quality on the two surfaces — not "right vs wrong", but a difference in product-positioning defaults.

Permanent carveout for `packages/opencode/src/tool/*.txt`: during upstream sync, always accept HEAD on these files. The underlying runtime (`tool.ts` dispatcher logic) keeps following upstream via graft + squash.

## Scope

Two PRs, each landing independently.

### PR1 — Description cleanup + multiedit removal (combined)

Combined because multiedit is dead code (not registered) and its removal carries the same risk profile as the .txt edits. Touches 8 description .txt files plus the 9 multiedit paths (5 src, 1 spec, 3 test).

**Description edits (8 files):**

- **`task.txt`**: delete fictional example agents block (`code-reviewer`, `greeting-responder`, `<example_agent_descriptions>` through end of file) plus surrounding `Example usage (NOTE: ...)` framing. This explicitly includes the two `<example>` blocks at task.txt L30-49 and L51-57 which reference the fictional `code-reviewer`/`greeting-responder` agents — both must go. Delete Usage notes item 1 (`Launch multiple agents concurrently whenever possible`) and item 4 (`outputs should generally be trusted`); renumber remaining consecutively. Rewrite L12 `Other tasks that are not related to the agent descriptions above` → `Other tasks that are not related to the available agents`. Rewrite the Usage note that contains the phrase `agent description` (currently L21, `If the agent description mentions ...`) by content match (not by post-renumber index) to point at "the available agents". (Both rewrites use schema-independent phrasing — "available agents" — to avoid layout-dependent terms like "above" or "in this tool description".) No new schema anchor on L3.
- **`glob.txt`** L6: delete only the second sentence (`It is always better to speculatively perform multiple searches as a batch that are potentially useful.`). Keep the first sentence.
- **`skill.txt`** L1 AND L5: rewrite `listed in the system prompt` / `listed in your system prompt` → `listed below` in both lines. Literal-pointer fix; the list is appended at the end of the skill tool description by `describeSkill`, not in system prompt.
- **`bash.txt`**:
  - L16: replace the `rm` quoting example with a non-destructive command. Pick from `bash.txt`'s own existing examples (`mkdir`, `python`); do NOT use `cat` because L31's avoid-list explicitly forbids `cat` via Bash, which would be self-contradictory.
  - L31 already contains a structured nested sub-bullet list under "Avoid using Bash with..." (sub-bullets like `File search: Use Glob (NOT find or ls)`, `Content search: Use Grep (NOT grep or rg)`, `Read files: Use Read (NOT cat/head/tail)`, `Edit files: Use Edit (NOT sed/awk)`, `Write files: Use Write (NOT echo >/cat <<EOF)`). Add a NEW sub-bullet at the same nesting level paralleling those: `File deletion: Use the trash tool, NOT \`rm\`/\`del\` (rm permanently deletes; trash is reversible)`.
  - Delete L86 (`NEVER use the TodoWrite or Task tools`) and L113 (`DO NOT use the TodoWrite or Task tools`). Both deletions are surgical; the rest of the L51-L117 git/PR workflow block stays untouched (deferred).
- **`trash.txt`**: expand from 4 to ~7 lines. Diff must contain phrases mentioning (a) reversibility (Trash is recoverable), (b) `rm`/`del` contrast (those are permanent and unrecoverable), and (c) when to use trash (any file/dir deletion in user-facing flows). PR diff verification is mechanical: grep the diff for these three categories.
- **`todowrite.txt`** L1 AND L166: delete the `demonstrate thoroughness to the user` framing on L1, and delete the entire L166 line (`When in doubt, use this tool. Being proactive with task management demonstrates attentiveness and ensures you complete all requirements successfully.`).
- **`apply_patch.txt`**: NOT audit clean. Add an explicit guardrail line near the top: `Use \`*** Delete File:\` only when the user has explicitly asked to delete a specific file by path. For any heuristic or cleanup deletion (e.g. removing temp files, cleaning up after a refactor), use the trash tool, not this syntax.` Phrased as user-intent ("explicitly asked" vs "heuristic") rather than git-state ("tracked" vs "ad-hoc") because apply_patch operates on filesystem paths regardless of git tracking — models cannot reliably distinguish tracked from untracked. This is best-effort prose-level guardrail; runtime dispatcher-level enforcement (rejecting `*** Delete File:` headers) is out of scope — same upstream-foundation reasoning as keeping `apply_patch` itself.
- **`edit.txt`**: append a line-deletion hint as a new bullet at the end of the existing usage-notes list: `To delete a line cleanly, the \`oldString\` must include the trailing newline character. Otherwise only the line content is removed and a stray blank line remains at the deleted position.` One-line teaching addition (not a rewrite); fills the boundary-case gap that biases weaker models toward stray-blank-line bugs.
- **`lsp.txt`**: audit clean. Note in PR body.

**Multiedit removal (9 paths total — 5 src + 1 spec + 3 test files):**

- Delete `packages/opencode/src/tool/multiedit.ts`.
- Delete `packages/opencode/src/tool/multiedit.txt`.
- Edit `packages/opencode/src/permission/index.ts:297`: remove `"multiedit"` from `EDIT_TOOLS` array.
- Edit `packages/opencode/src/config/agent.ts:111`: remove `|| tool === "multiedit"` from the union check.
- Edit `packages/opencode/src/config/config.ts:733`: remove `|| tool === "multiedit"` from the union check.
- Edit `packages/opencode/specs/effect-migration.md:269`: remove the multiedit list entry (one-line removal).
- Edit `packages/opencode/test/agent/agent.test.ts:493`: rename test from `"legacy tools config maps write/edit/patch/multiedit to edit permission"` to `"legacy tools config maps write/edit/patch to edit permission"` (test logic unchanged).
- Edit `packages/opencode/test/config/config.test.ts:2006-2030`: delete the entire `test("migrates legacy multiedit tool to edit permission", ...)` block — multiedit-specific migration test, no longer applies after removal.
- Edit `packages/opencode/test/permission/next.test.ts:390-401`: remove `multiedit` from input array, remove `expect(result.has("multiedit")).toBe(true)` assertion, rename test name from `"disabled - disables edit/write/apply_patch/multiedit when edit denied"` to `"disabled - disables edit/write/apply_patch when edit denied"`.

### PR2 — Codesearch removal

- Delete `packages/opencode/src/tool/codesearch.txt`.
- Delete `packages/opencode/src/tool/codesearch.ts`.
- In `registry.ts:319-321`, edit the OR condition `tool.id === CodeSearchTool.id || tool.id === WebSearchTool.id` to remove **only** the `CodeSearchTool.id` arm. Post-edit shape: `if (tool.id === WebSearchTool.id) { return input.providerID === ProviderID.opencode || Flag.OPENCODE_ENABLE_EXA }`. Websearch's `providerID === ProviderID.opencode || Flag.OPENCODE_ENABLE_EXA` gating is preserved.
- Remove the `CodeSearchTool` import.
- Edit `packages/opencode/src/agent/agent.ts:185`: remove the `codesearch: "allow"` permission entry from the default permission ruleset (otherwise the entry references a deleted tool).
- Remove `Flag.OPENCODE_ENABLE_EXA` only after grepping every reference: `packages/**`, `tests/**`, `docs/**`, env-handling code, README/CHANGELOG, and any `.env*` template. If any non-codesearch reference exists, keep the flag.
- Remove tests that exercise codesearch.
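
The post-edit gate can be sketched as a small standalone function. Everything around the `if` is assumed for illustration: `ProviderID`, `Flag`, and `WebSearchTool` here are local stand-ins (including the `"websearch"` and `"opencode"` string values), `isToolVisible` is a hypothetical name, and the default `return true` for other tools is a guess at the surrounding filter, not a copy of `registry.ts`.

```typescript
// Local stand-ins for the real imports; the values are assumptions.
const ProviderID = { opencode: "opencode" } as const
const Flag = { OPENCODE_ENABLE_EXA: false }
const WebSearchTool = { id: "websearch" }

// Post-edit shape: a single condition on WebSearchTool.id, with no
// CodeSearchTool arm left in the check.
function isToolVisible(tool: { id: string }, input: { providerID: string }): boolean {
  if (tool.id === WebSearchTool.id) {
    return input.providerID === ProviderID.opencode || Flag.OPENCODE_ENABLE_EXA
  }
  return true
}

// Websearch stays gated exactly as before.
const onOpencode = isToolVisible(WebSearchTool, { providerID: "opencode" })
const onOtherProvider = isToolVisible(WebSearchTool, { providerID: "zen" })

// Tools outside the gate are unaffected by the provider.
const editOnOtherProvider = isToolVisible({ id: "edit" }, { providerID: "zen" })
```

The point of the sketch is that websearch keeps both escape hatches (the opencode provider and the Exa flag) after the `CodeSearchTool.id` arm is dropped; flipping `Flag.OPENCODE_ENABLE_EXA` to `true` makes websearch visible on any provider.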

### Out of scope

- **`apply_patch` removal**: not in scope. `apply_patch` is a real OpenAI product optimization against GPT-5+ string-replace hallucination; the patch format with anchor lines is significantly more accurate. Removal would require dispatcher rewiring (`registry.ts:323-327` toggles `apply_patch` ↔ `edit/write` as mutually exclusive when `Env.get("OPENCODE_E2E_LLM_URL")` is set OR `modelID.includes("gpt-")` and not `oss`/`gpt-4`; see that range for the full condition), conflicting with the "stay close to upstream until 10K users" foundation commitment. Description cleanup of `apply_patch.txt` (the `*** Delete File` prose guardrail) IS in scope as part of PR1.
- **LSP**: handled in #232 (default permission deny + global setting toggle). Not touched here.
- **`bash.txt` ~64 lines git/PR workflow** (after L86/L113 removal): still deferred. Not teaching wrong behavior; pure context-budget compression that needs validation infrastructure.
- **`todowrite.txt` 8-example demonstration block**: still deferred for the same reason.
- **Smoke transcript upgrade**: reviewer suggestion declined as not a priority; revisit when adopting an autoresearch-style harness improvement framework.

## Preconditions

PR1 multiedit-removal preconditions (run before opening PR):

- [ ] `rg -nw 'multiedit' packages/ tests/ docs/ --glob '!docs/superpowers/**' ; rg -n '"multiedit"' packages/ tests/ docs/ --glob '!docs/superpowers/**' ; rg -n 'MultiEditTool' packages/ tests/ docs/ --glob '!docs/superpowers/**'` (three runs — word-boundary, quoted-form, AND CamelCase class name; `;` not `&&` because rg exits non-zero on no matches and `&&` would skip subsequent runs when one finds nothing; `--glob '!docs/superpowers/**'` excludes the local plan file which itself contains `multiedit` references) — list every reference; the 9 known paths (`tool/multiedit.ts`, `tool/multiedit.txt`, `permission/index.ts:297`, `config/agent.ts:111`, `config/config.ts:733`, `specs/effect-migration.md:269`, `test/agent/agent.test.ts:493`, `test/config/config.test.ts:2006-2030`, `test/permission/next.test.ts:390-401`) must all be removed/updated in this PR.

PR2 codesearch-removal preconditions (run before opening PR):

- [ ] `rg -n 'codesearch|CodeSearchTool' packages/ tests/ docs/` — list every reference; all must be removable.
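- [ ] `rg -n 'OPENCODE_ENABLE_EXA' packages/ tests/ docs/ ; rg -n --hidden --glob '.env*' 'OPENCODE_ENABLE_EXA' .` (two runs; rg does not expand a quoted `'**/.env*'` passed as a path argument, so the `.env*` templates are searched with `--hidden --glob` instead) to list every reference; if any non-codesearch reference exists, keep the flag.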
- [ ] Confirm `WebSearchTool.id` retains its gating behavior post-edit (manual diff inspection of the OR condition).

## Owner-local checklist (NOT in PR diffs; AGENTS.md is git-excluded)

These items are owner-only attestation, recorded here for completeness, and are NOT verifiable in PR diffs because AGENTS.md is excluded from git via `.git/info/exclude`:

- AGENTS.md `Engineering conventions` section opens with the new principle: `Improvements over upstream are allowed when there's a clear product reason. Strategic carveouts (UI rewrite, i18n zh+en only, tool descriptions) are permanent. One-off improvements (trash replacing rm, removing tools that don't fit PawWork) sit between fork and vendor — log in memory and review on each upstream sync, not as long-term divergence.`
- AGENTS.md long-term divergence list: existing `tool/*.txt` permanent carveout kept; add `Tool registry: codesearch removed` (for PR2) and `Tool registry: multiedit removed` (for PR1) entries when those PRs land.
- New memory file `project_tool_set_v1.md` records the decisions (multiedit deleted / codesearch deleted / apply_patch unchanged / LSP via #232) plus rationale.
- New memory follow-up signal: trash-vs-rm prose-level guardrail is best-effort; if real sessions show models still routinely reaching for `rm` despite PR1, escalate to bash-dispatcher-level rm interception (separate issue, separate scope).

## Acceptance criteria

### PR1 (verifiable in PR diff)

- [ ] `task.txt`: fictional agents block, `Example usage` framing, Usage 1 (`Launch multiple agents concurrently`), Usage 4 (`outputs should generally be trusted`) all removed. Remaining Usage notes renumbered with no orphan numbering.
- [ ] `task.txt` L12 dangling `above` reference rewritten as specified.
- [ ] `task.txt` Usage note containing `agent description` (currently L21) rewritten by content match to point at "the available agents" (schema-independent phrasing matching the PR1 scope description).
- [ ] `task.txt` L3 carries no new schema anchor.
- [ ] `glob.txt` L6: only second sentence removed; first sentence retained.
- [ ] `skill.txt` L1 AND L5: both wrong pointers rewritten to `listed below`.
- [ ] `bash.txt` L16: `rm` removed from quoting example; replaced with `mkdir` or `python` (NOT `cat`).
- [ ] `bash.txt` L31 bullet list: new `File deletion: Use the trash tool, NOT \`rm\`/\`del\`` bullet added.
- [ ] `bash.txt` L86 (`NEVER use the TodoWrite or Task tools`) AND L113 (`DO NOT use the TodoWrite or Task tools`) deleted; rest of git/PR workflow block untouched.
- [ ] `trash.txt` expanded; diff contains phrases mentioning (a) reversibility, (b) `rm`/`del` contrast, (c) when to use.
- [ ] `todowrite.txt` L1 (`demonstrate thoroughness`) AND L166 (`When in doubt, use this tool ... demonstrates attentiveness`) edits applied.
- [ ] `apply_patch.txt` adds the `*** Delete File:` prose guardrail line near the top.
- [ ] `edit.txt` appends the line-deletion hint as the last bullet, mentioning the trailing newline requirement.
- [ ] PR body notes audit result for `lsp.txt` (clean), `apply_patch.txt` (not clean → guardrail added), and `edit.txt` (teaching gap → hint added).
- [ ] `packages/opencode/src/tool/multiedit.ts` deleted.
- [ ] `packages/opencode/src/tool/multiedit.txt` deleted.
- [ ] `packages/opencode/src/permission/index.ts:297`: `"multiedit"` removed from `EDIT_TOOLS` array.
- [ ] `packages/opencode/src/config/agent.ts:111`: `|| tool === "multiedit"` removed.
- [ ] `packages/opencode/src/config/config.ts:733`: `|| tool === "multiedit"` removed.
- [ ] `packages/opencode/specs/effect-migration.md:269` multiedit list entry removed.
- [ ] `packages/opencode/test/agent/agent.test.ts:493` test renamed to `"legacy tools config maps write/edit/patch to edit permission"` (multiedit removed from name only; test body unchanged).
- [ ] `packages/opencode/test/config/config.test.ts` `test("migrates legacy multiedit tool to edit permission", ...)` block deleted entirely.
- [ ] `packages/opencode/test/permission/next.test.ts` test renamed (multiedit removed from name), `multiedit` removed from input array, `expect(result.has("multiedit")).toBe(true)` assertion removed.
- [ ] Post-deletion three-grep set (`rg -nw 'multiedit'`, `rg -n '"multiedit"'`, `rg -n 'MultiEditTool'`) all return 0 hits across `packages/ tests/ docs/ --glob '!docs/superpowers/**'`.
- [ ] `bun --cwd packages/opencode test` PASSES (the three modified test files all stay green).

### PR2 (verifiable in PR diff)

- [ ] `packages/opencode/src/tool/codesearch.ts` deleted.
- [ ] `packages/opencode/src/tool/codesearch.txt` deleted.
- [ ] `registry.ts:319-321` edited to remove only the `CodeSearchTool.id` arm; post-edit shape is `if (tool.id === WebSearchTool.id) { return ... }` (single condition, not OR).
- [ ] `CodeSearchTool` import removed.
- [ ] `packages/opencode/src/agent/agent.ts:185` `codesearch: "allow"` permission entry removed.
- [ ] `Flag.OPENCODE_ENABLE_EXA` removed only if no non-codesearch reference exists; otherwise flag preserved.
- [ ] All codesearch tests removed.

### Validation (lightweight, no fixture)

For each PR, one manual smoke run per model in a real PawWork session.

- [ ] Models: Opus, Sonnet, GLM-5.1, Kimi K2.6, Qwen Coder. Verify each model ID against the current PawWork model registry before running smoke (typo guard).
- [ ] Provider routing: at minimum one run on Zen-routed default provider AND one run on `ProviderID.opencode` (PR2 codesearch gate change must not regress the opencode-provider websearch path).
- [ ] PR1 smoke: a task that uses `task` dispatch; a deletion (verify trash route, not rm, and verify the model does NOT use apply_patch's `*** Delete File:` syntax for ad-hoc deletion either); a multi-edit-style change confirms repeated `edit` calls work; no model attempts to call `multiedit`; an edit-tool single-line deletion (verify the diff has no stray blank line at the deleted position — validates the new edit.txt hint changes weaker-model behavior).
- [ ] PR2 smoke: a programming question on Zen (no codesearch dependence) AND on `ProviderID.opencode` (websearch still gated correctly).
- [ ] Any model fabricating `subagent_type = "code-reviewer"` or attempting to call `multiedit`/`codesearch` is a regression to fix before merge.

### CI keyword gate: removed

The original CI keyword gate proposal is dropped. Heavy regex-grep on text files is the wrong shape for this kind of guardrail. Anti-regression is enforced by PR diff review plus the manual smoke list.

## Risks

- **R1 (model regression)** — substantially reduced from the original plan. Description cleanup removes content that was either lying (multiedit atomicity), pushing wrong tool selection (`code-reviewer` fictional agent, `speculatively batch`, `demonstrate thoroughness`), or contradicting project rules (NEVER TodoWrite during git). Models losing these hints should not regress.
- **R2 (permanent description carveout)** — unchanged; strategic product-positioning difference, accepted trade-off.
- **R8 (multiedit deletion breaks callers)** — split path: (a) success path — calling `edit` multiple times is functionally equivalent (slightly more verbose); (b) failure path — multiedit was never atomic in code despite the description claim, so removing it does not lose any safety property that actually existed at runtime. Note: GPT-5+ models on `usePatch=true` (`registry.ts:323-327`) already do not see `edit`/`write` (gated off when patch mode is active), so multiedit removal does not change their effective edit surface — they continue to use `apply_patch` as the single edit tool. Non-GPT models retain `edit`/`write` and lose only the multiedit sugar.
- **R8b (codesearch deletion breaks callers)** — PawWork users on default Zen never reached codesearch; opencode-provider users on the upstream surface are unaffected because the deletion does not propagate upstream (carveout).
- **R9 (one-off dispatcher divergence vs "10K users foundation" rule)** — Mitigation: AGENTS.md "Improvements over upstream are allowed when there's a clear product reason" principle codifies that one-off product-driven removal is acceptable; long-term divergence list logs codesearch and multiedit; on upstream sync we accept HEAD then re-remove rather than treat them as permanent carveouts.
- **R10 (websearch gate accidentally broken in PR2)** — codesearch and websearch share the same OR condition gate at `registry.ts:319-321`. Mitigation: PR2 acceptance criterion explicitly requires `WebSearchTool.id` arm remain intact, verifiable in diff. PR2 smoke includes a websearch run on `ProviderID.opencode` to catch any regression.
- **R11 (apply_patch `*** Delete File:` is the real bypass path; bash `rm` is already hard-blocked)**: bash `rm`/`rmdir`/`unlink`/etc. are denied at the permission layer (see the "What do you do today?" note on `agent.ts:99-110`), so they cannot bypass trash. The actual gap is `apply_patch`: `*** Delete File:` syntax goes through the `apply_patch` permission, not the `bash` permission, so the `rm *: deny` rule does not apply. Note: PawWork's permission grammar is per-tool (e.g. `bash: { "rm *": "deny" }`), not per-payload-pattern; adding something like `apply_patch: { "*** Delete File:*": "deny" }` is NOT a simple permission-rule change, because it would require either new grammar support or dispatcher-level interception of patch envelope contents. The PR1 prose guardrail in apply_patch.txt is best-effort instruction to the model, not runtime enforcement. Accepted trade-off because dispatcher-level rejection of `*** Delete File:` carries the same upstream-foundation cost as removing apply_patch outright. Escalate only if real sessions show models routinely using `*** Delete File:` to delete user files.

## Precedent

- #130 — model-specific behavior prompts removed; same pattern, completed.
- The shipped `trash` tool replacing `rm` is the prior one-off product improvement over upstream cited in the new "Improvements over upstream" principle.




