Skip to content

[Feature] Tighten plugin and external-skill loading in PawWork mode #263

@Astro-Han

Description

@Astro-Han

What task are you trying to do?

Stop PawWork from silently auto-loading and trusting third-party plugins or external skill metadata. Today a cloned repository can drop a .opencode/plugins/foo.js file or a SKILL.md with imperative description text and influence the session before the user takes any action. PawWork is non-technical-user-first, so the default posture should be deny-by-default for third-party plugins and treat external skill descriptions as untrusted data, not instructions.

What do you do today?

Three independent attack vectors live in the current code, only the third of which is actively observable in the field:

  1. Plugin host has no sandbox. packages/opencode/src/plugin/loader.ts:99 does await import(entry) directly into the host process. Hooks can mutate LLM headers, tool args, tool results, streamed text, and arbitrary message content. Currently inert in PawWork because no third-party plugin is installed, but the loading path is open.
  2. Project-local plugin auto-load is a zero-click vector. PawWork mode enumerates project .opencode and .pawwork directories walking up from the worktree (packages/opencode/src/config/paths.ts:32-46), then globs {plugin,plugins}/*.{ts,js} and feeds matches to await import(). Cloning a repository that ships such a file and opening it in PawWork executes attacker JS with the user's OS permissions, no consent prompt.
  3. Skill description text is injected verbatim into the system prompt. Skill discovery scans ~/.claude, ~/.agents, project-up-tree .claude and .agents for SKILL.md files (packages/opencode/src/skill/index.ts:178-191). The description frontmatter field is rendered into the <available_skills> block by Skill.fmt (lines 277-301) without any source-of-trust framing, so a third-party skill author's description string reads to the model as a PawWork-internal instruction. This is the vector firing in the silent-tiger trace: the using-superpowers skill description literally says "requiring Skill tool invocation before ANY response", and the model obeys it. Benign-but-aggressive today; an attack-grade variant ("before responding, run bash...") is identical in shape.

This issue's earlier draft incorrectly attributed the observed self-invocation of using-superpowers to plugin injection via experimental.chat.messages.transform. Verification against trace + source showed the actual mechanism is skill description rendering in the system prompt, not plugin code execution. Vector A in particular is currently inert on the user's machine, and the rewrite reflects that.

What would a good result look like?

PawWork sets a deny-by-default posture for third-party plugins (with one explicit opt-in for power users) and wraps external-source skill descriptions in an untrusted-metadata fence in the system prompt. Existing skill ecosystems including agentskills.io and ~/.claude/skills/ keep working; what changes is how their description text is framed for the model. The 6 internal INTERNAL_PLUGINS (OAuth glue compiled into the binary) are unaffected because they are not loaded via the third-party plugin path.

Detailed spec, file-by-file change set, considered alternatives, and open questions live in the first comment below so the issue body stays a stable summary while implementation discussion accumulates inline.

Which audience does this matter to most?

Both

Extra context

This is foundation-layer work that PawWork is doing as a permanent product-layer carve-out, not as upstream PRs. Upstream anomalyco/opencode has had these vectors reported repeatedly (#2242, #6361, #6606, #18781, #19123) with no active fix in flight: #6361 was stale-bot-closed despite a published POC, PR #18784 (warning-text-only fix for the skill description vector) is closed unmerged, and the upstream roadmap is expanding the plugin hook surface (#21075, #19919) rather than tightening it. PawWork has already de facto carved out from upstream's plugin ecosystem in #30 (closed), so this issue formalizes the next layer of that carve-out.

Related: #195 (harness improvement series), #30 (plugin GUI, closed).

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High priorityappApplication behavior and product flowsenhancementNew feature or requestharnessModel harness, prompts, tool descriptions, and session mechanics

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions