You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PawWork's runtime is vendored from opencode and will stay that way until scale justifies deeper divergence. Where PawWork diverges permanently, the preferred pattern is not to fork the runtime but to adjust the harness layer: product system prompt, base tool descriptions, session mechanics, and loop or diagnostic observability. These adjustments share one direction: fewer surfaces that assume a developer audience, better defaults for weaker models, and clearer recovery when things go wrong.
What belongs in this series
A change belongs here when it touches any of:
packages/opencode/src/session/prompt/*.txt or nearby prompt-composition paths, including provider system prompts and bundled product instructions.
packages/opencode/src/tool/*.txt or other model-visible tool-description surfaces.
Session-level behavior such as plan approval, question routing, subagent wiring, loop detection, or diagnostics.
Global instruction injection, such as packages/opencode/src/session/instruction.ts and the bundled pawwork.txt loading path.
Base-tool exposure decisions, plugin boundaries, and other changes that alter what the model sees or prefers by default.
A change does not belong here when it only affects UI (ui), the Electron shell (desktop), or CI (ci), even if it touches agent behavior indirectly.
Series themes
1. System prompt unification and product instruction architecture
Goal: one PawWork-owned behavior surface across providers and models, with product rules living in the right layer instead of being scattered through model-family branches.
Follow-up prompt-layer cleanups that keep unfamiliar-tool guidance and product behavior in the system or project instruction layer rather than duplicating workflow text inside generic tools.
2. Tool semantics and prompt optimization
Goal: make base tools easier for weaker models to use correctly, remove instructions that teach wrong behavior, and tighten the boundaries between nearby tools.
3. Tool surface reduction and boundary simplification
Goal: keep the default tool surface small and strong, and decide which capabilities should stay first-class versus move behind plugins or a narrower default surface.
[Feature] Move advanced low-frequency tools to plugins #131 Move advanced low-frequency tools to plugins — closed as not planned for now. The plugin-tool-registration path is too broad for the current harness series, and the base-tool surface should be revisited only with fresh evidence from real sessions.
Open direction question: should tools such as grep and glob remain standalone base tools, move behind plugins, or be progressively replaced by stronger Bash guidance and fewer default tools?
4. Web access and source safety defaults
Goal: search when the task needs current or external evidence, but treat fetched content as untrusted input and prefer high-quality sources.
These are already implied by the themes above, but are called out here because they are easy to lose between issues.
bash.txt should not embed GitHub workflow. A generic terminal tool description should not carry commit, PR, or inline-review workflow tutorials as if every PawWork user were a frequent gh user.
Unfamiliar-command guidance should live above Bash. The preferred rule is: when the model is not confident about a CLI surface such as GitHub CLI, first check gh <command> --help instead of guessing API shape or arguments. This belongs in the system or project instruction layer, not as a large embedded workflow inside bash.txt.
Tool consolidation direction is still open, but not currently planned as plugin migration work. Moving tools behind plugins requires a broader plugin SDK tool-registration surface. Keep this as a design question until real sessions show the current base surface is the bottleneck.
High-friction workflow ergonomics may still need dedicated helpers. If repeated real sessions show that workflows like PR inline review remain error-prone even after prompt cleanup, we may need narrower helpers instead of expecting models to build everything from raw gh api usage.
Do not turn model intent mistakes into automatic harness repair. PawWork should make the failure layer clear, as in [Feature] Add structured tool failure reasons for agent recovery #439, but avoid guessing paths, commands, filenames, or other semantic intent on the model's behalf.
Why this series exists
PawWork's runtime is vendored from opencode and will stay that way until scale justifies deeper divergence. Where PawWork diverges permanently, the preferred pattern is not to fork the runtime but to adjust the harness layer: product system prompt, base tool descriptions, session mechanics, and loop or diagnostic observability. These adjustments share one direction: fewer surfaces that assume a developer audience, better defaults for weaker models, and clearer recovery when things go wrong.
What belongs in this series
A change belongs here when it touches any of:
packages/opencode/src/session/prompt/*.txtor nearby prompt-composition paths, including provider system prompts and bundled product instructions.packages/opencode/src/tool/*.txtor other model-visible tool-description surfaces.packages/opencode/src/session/instruction.tsand the bundledpawwork.txtloading path.A change does not belong here when it only affects UI (
ui), the Electron shell (desktop), or CI (ci), even if it touches agent behavior indirectly.Series themes
1. System prompt unification and product instruction architecture
Goal: one PawWork-owned behavior surface across providers and models, with product rules living in the right layer instead of being scattered through model-family branches.
2. Tool semantics and prompt optimization
Goal: make base tools easier for weaker models to use correctly, remove instructions that teach wrong behavior, and tighten the boundaries between nearby tools.
3. Tool surface reduction and boundary simplification
Goal: keep the default tool surface small and strong, and decide which capabilities should stay first-class versus move behind plugins or a narrower default surface.
grepandglobremain standalone base tools, move behind plugins, or be progressively replaced by stronger Bash guidance and fewer default tools?4. Web access and source safety defaults
Goal: search when the task needs current or external evidence, but treat fetched content as untrusted input and prefer high-quality sources.
5. Session control, approval, and recovery
Goal: let the model pause appropriately before risky work, ask better questions, and stop visible spinning when progress is low.
6. Export and replay observability
Goal: make local session exports useful enough to explain failures without adding remote telemetry or a dashboard.
stopCurrent known gaps
These are already implied by the themes above, but are called out here because they are easy to lose between issues.
bash.txtshould not embed GitHub workflow. A generic terminal tool description should not carry commit, PR, or inline-review workflow tutorials as if every PawWork user were a frequentghuser.gh <command> --helpinstead of guessing API shape or arguments. This belongs in the system or project instruction layer, not as a large embedded workflow insidebash.txt.gh apiusage.Active work
Maintenance
This issue stays open as a living index for the harness series. When filing new harness work:
harnessTask items auto-check when sub-issues close via GitHub native sub-issue linkage.
Precedent
Prior closed work worth citing as precedent: