
[Feature]: Mount skill directories read-only in sandbox containers #17931

@tasaankaeris

Summary

Skill directories are copied into the writable sandbox workspace, allowing agents to modify skill files; they should be bind-mounted read-only instead.

Problem to solve

Skills are instructions; they should never need to be writable. Runtime security cannot be enforced if the instruction sets given to agents can be modified by attacking code.

Skill directories are writable by the agent in all sandbox workspaceAccess modes:

  • "rw": The host workspace is mounted read-write at /workspace, so skills at <workspace>/skills/ are directly writable through this mount.
    • syncSkillsToWorkspace is not called (src/agents/sandbox/context.ts:48, guard: if (cfg.workspaceAccess !== "rw"))
  • "ro": The host workspace is mounted read-only, but skills are copied into the sandbox workspace via syncSkillsToWorkspace (src/agents/sandbox/context.ts:50-54)
    • The sandbox workspace is writable (it's the agent's working directory), so the copies are writable.
  • "none": No host workspace mounted
    • Skills are copied into the writable sandbox workspace, same as "ro"

In every case, a sandboxed agent can modify skill files. The exec tool doesn't bypass read-only mounts (those are kernel-enforced), but it doesn't need to: skills simply aren't in any read-only mount. They're either part of a writable host mount (rw) or writable copies in the sandbox workspace (ro/none).
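The three cases above can be summarized as a small decision table. This is a minimal sketch that only models the behavior described in this issue; the type and function names are hypothetical and are not taken from the OpenClaw source.

```typescript
// Sketch only: models the behavior described in this issue.
// WorkspaceAccess, SkillPlacement and currentSkillPlacement are hypothetical names.
type WorkspaceAccess = "rw" | "ro" | "none";

interface SkillPlacement {
  source: "host-mount" | "sandbox-copy"; // where the agent sees the skill files
  copied: boolean;                        // whether a copy step (syncSkillsToWorkspace) runs
  writableByAgent: boolean;               // whether the agent can modify the files
}

function currentSkillPlacement(access: WorkspaceAccess): SkillPlacement {
  if (access === "rw") {
    // Host workspace is mounted read-write; skills sit inside that mount.
    return { source: "host-mount", copied: false, writableByAgent: true };
  }
  // "ro" and "none": skills are copied into the writable sandbox workspace.
  return { source: "sandbox-copy", copied: true, writableByAgent: true };
}

const modes: WorkspaceAccess[] = ["rw", "ro", "none"];
// True today: no mode leaves skills read-only.
const allWritable = modes.every((m) => currentSkillPlacement(m).writableByAgent);
```

The point of the table is that `writableByAgent` is true on every branch, which is the gap this issue describes.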

Proposed solution

Mount each skill directory as a read-only bind mount in all workspaceAccess modes:

-v /home/user/.openclaw/skills/my-skill:/workspace/skills/my-skill:ro
-v /path/to/workspace/.agents/skills/git-skill:/workspace/skills/git-skill:ro

  • "rw" mode: The host workspace is still mounted read-write at /workspace, but individual skill directories are overlaid with :ro mounts under /workspace/skills/. Docker applies a mount at a more specific path on top of a broader mount, so the agent can write to /workspace but not to the skill directories mounted under /workspace/skills/.
  • "ro"/"none" modes: Skip the syncSkillsToWorkspace copy entirely. Mount skill source directories directly from the host with :ro. No copies, no writable duplicates.

This gives kernel-enforced skill immutability regardless of workspace access mode.
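As a rough illustration of the proposal, the read-only skill mounts could be generated from a list of skill directories. This is a sketch under assumptions: `SkillDir` and `buildSkillBindArgs` are hypothetical names, the container target /workspace/skills/<name> is taken from the example flags above, and the name check is an extra precaution that the issue does not specify.

```typescript
// Sketch only: SkillDir and buildSkillBindArgs are hypothetical, not OpenClaw APIs.
interface SkillDir {
  name: string;     // directory name used under /workspace/skills/
  hostPath: string; // absolute path of the skill directory on the host
}

function buildSkillBindArgs(
  skills: SkillDir[],
  containerRoot: string = "/workspace/skills",
): string[] {
  const args: string[] = [];
  for (const skill of skills) {
    // Assumption: reject names that could escape containerRoot or break the
    // host:container:mode syntax of `docker run -v`.
    if (!/^[A-Za-z0-9._-]+$/.test(skill.name) || skill.name === "." || skill.name === "..") {
      throw new Error("Unsafe skill name: " + skill.name);
    }
    args.push("-v", skill.hostPath + ":" + containerRoot + "/" + skill.name + ":ro");
  }
  return args;
}

// "rw" mode example: the workspace mount stays writable, skills are overlaid read-only.
const exampleArgs: string[] = ["-v", "/path/to/workspace:/workspace:rw"].concat(
  buildSkillBindArgs([
    { name: "my-skill", hostPath: "/home/user/.openclaw/skills/my-skill" },
    { name: "git-skill", hostPath: "/path/to/workspace/.agents/skills/git-skill" },
  ]),
);
```

In "ro" and "none" modes the same helper would be used without the copy step; how the rest of the workspace is mounted in those modes is left out here because the issue does not spell it out.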

The infrastructure for this already exists:

  • docker.binds supports :ro mounts (src/agents/sandbox/docker.ts:318-322)
  • parseSandboxBindMount correctly parses the :ro flag (src/agents/sandbox/fs-paths.ts:26-53)
  • Workspace mounts already use :ro when workspaceAccess === "ro" (src/agents/sandbox/docker.ts:345-347)
  • Docker handles overlapping mounts correctly -- /workspace/skills/my-skill:ro is kernel-enforced read-only even if /workspace is writable
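To make the invariant concrete, a check like the following could assert that every bind landing under /workspace/skills carries the ro flag. This is a simplified sketch: `parseBindSpec` is a hypothetical stand-in written for illustration, not the project's parseSandboxBindMount (whose implementation is not shown in this issue), and it only handles POSIX-style host:container[:options] specs.

```typescript
// Sketch only: a simplified stand-in parser, not the project's parseSandboxBindMount.
interface ParsedBind {
  hostPath: string;
  containerPath: string;
  readOnly: boolean;
}

function parseBindSpec(spec: string): ParsedBind {
  // Assumes POSIX paths without ":" in them; Windows drive letters are not handled.
  const parts = spec.split(":");
  const hostPath = parts[0] ?? "";
  const containerPath = parts[1] ?? "";
  const options = parts[2] ?? "";
  if (parts.length > 3 || hostPath === "" || containerPath === "") {
    throw new Error("Unsupported bind spec: " + spec);
  }
  // Options are comma-separated, e.g. "ro" or "ro,z".
  const readOnly = options.split(",").indexOf("ro") !== -1;
  return { hostPath, containerPath, readOnly };
}

// Returns true when every bind mounted at or below skillsRoot is read-only.
function skillBindsAreReadOnly(specs: string[], skillsRoot: string = "/workspace/skills"): boolean {
  return specs
    .map(parseBindSpec)
    .filter((b) => b.containerPath === skillsRoot || b.containerPath.indexOf(skillsRoot + "/") === 0)
    .every((b) => b.readOnly);
}
```

A check of this shape could run just before the container is started, so a misconfigured writable skill mount fails loudly instead of silently weakening the sandbox.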

Alternatives considered

  • Application-level write guard: Extend wrapToolWorkspaceRootGuard in src/agents/pi-tools.read.ts to reject writes to skills/ paths
    • Weaker -- an agent with exec access can bypass JavaScript path checks via shell commands. Still useful as a complementary measure for non-sandboxed agents
  • Hash verification: Hash skill files at load time, verify before use. Detects tampering but doesn't prevent it. A detection-only measure.
  • Sandbox workspace read-only: Make the entire sandbox workspace read-only
    • Too restrictive -- agents need to write working files during sessions
  • Keep the copy, mark copied dir read-only via chmod
    • Possible but fragile: the agent runs as the same user inside the container, so chmod is reversible. By contrast, a kernel-enforced :ro bind mount cannot be bypassed from inside the container.
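For reference, the application-level write guard from the first alternative could look roughly like this. It is a sketch with hypothetical names (`normalizePosix`, `isSkillWrite`); it does not reproduce wrapToolWorkspaceRootGuard, it assumes POSIX paths and a /workspace root, and it does not resolve symlinks.

```typescript
// Sketch only: hypothetical helpers, not the project's wrapToolWorkspaceRootGuard.

// Resolve "." and ".." segments of a POSIX path without touching the filesystem.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      out.pop();
      continue;
    }
    out.push(seg);
  }
  return "/" + out.join("/");
}

// True when targetPath (absolute, or relative to workspaceRoot) is the skills
// directory itself or anything below it.
function isSkillWrite(targetPath: string, workspaceRoot: string = "/workspace"): boolean {
  const absolute = targetPath.charAt(0) === "/" ? targetPath : workspaceRoot + "/" + targetPath;
  const resolved = normalizePosix(absolute);
  const skillsRoot = normalizePosix(workspaceRoot + "/skills");
  return resolved === skillsRoot || resolved.indexOf(skillsRoot + "/") === 0;
}
```

A write tool would reject any request for which `isSkillWrite` returns true. As noted above, this only covers tools that go through the JavaScript check; a shell command run via exec never reaches it, which is why the bind mount is the stronger option.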

Impact

Affected users/systems:

All users who have enabled sandboxing (agents.defaults.sandbox.mode: "non-main" or "all"). These users have explicitly opted into isolation -- they expect the sandbox to be a security boundary. This gap means their assumed security posture is stronger than their actual posture.

Severity:

This is not a crash or a UX annoyance. It is a gap in the security model that is invisible to the operator. There is no warning or error when an agent modifies a skill file. The operator has no indication it happened.

Frequency:

Always present

Consequences:

Agent self-modification: An agent can rewrite skill instructions mid-session to alter its own behavior, bypass operator-intended guardrails encoded in skills, or escalate its own capabilities.

Cross-skill tampering: An agent can modify one skill's scripts to affect behavior triggered by another skill's instructions. The operator sees two independent skills but the agent has merged them.

Persistence risk: If the sandbox container is reused across sessions (depending on sandbox lifecycle config), modified skill copies persist into future sessions. The agent's modifications outlive the session that made them.

Evidence/examples

Docker read-only mounting itself was designed with this exact sort of security issue in mind.

MCP incidentally follows the same principle: an agent can call an MCP server, but it cannot alter that server.

Additional information

This could easily be an option in openclaw.json (skillReadOnly: true?) rather than enforced as a breaking change, depending on desired impact.
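A minimal sketch of how such an opt-in flag might be resolved, assuming the tentative key name skillReadOnly from the sentence above. The interface, the function names, and the default of false are all assumptions made for illustration; where the key would live inside openclaw.json is not decided here.

```typescript
// Sketch only: SandboxSkillConfig and both functions are hypothetical.
interface SandboxSkillConfig {
  skillReadOnly?: boolean; // tentative key name suggested in this issue
}

// Assumption: default to false so existing setups keep their current behavior.
function resolveSkillReadOnly(cfg: SandboxSkillConfig, fallback: boolean = false): boolean {
  return typeof cfg.skillReadOnly === "boolean" ? cfg.skillReadOnly : fallback;
}

// "legacy" stands for today's behavior (writable host mount or writable copy).
function chooseSkillStrategy(cfg: SandboxSkillConfig): "ro-bind-mount" | "legacy" {
  return resolveSkillReadOnly(cfg) ? "ro-bind-mount" : "legacy";
}
```

Defaulting to false avoids a breaking change; flipping the fallback to true later would make read-only skills the default while still letting operators opt out.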

Labels

  • P2 - Normal backlog priority with limited blast radius.
  • clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
  • clawsweeper:linked-pr-open - ClawSweeper found an open linked pull request for this issue.
  • clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:needs-product-decision - ClawSweeper marked this issue as needing a product or behavior decision.
  • clawsweeper:needs-security-review - ClawSweeper marked this issue as needing security-sensitive review.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
  • enhancement - New feature or request
  • impact:security - Security boundary, credential, authz, sandbox, or sensitive-data risk.
  • issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
