[Feature]: Mount skill directories read-only in sandbox containers #17931
Closed
Labels
P2 (Normal backlog priority with limited blast radius.)
clawsweeper:fix-shape-clear (ClawSweeper found a clear likely implementation shape for this issue.)
clawsweeper:linked-pr-open (ClawSweeper found an open linked pull request for this issue.)
clawsweeper:needs-maintainer-review (ClawSweeper marked this issue as needing maintainer review before automation.)
clawsweeper:needs-product-decision (ClawSweeper marked this issue as needing a product or behavior decision.)
clawsweeper:needs-security-review (ClawSweeper marked this issue as needing security-sensitive review.)
clawsweeper:no-new-fix-pr (ClawSweeper does not recommend queueing a new automated fix PR for this issue.)
clawsweeper:source-repro (ClawSweeper found a high-confidence source-level issue reproduction.)
enhancement (New feature or request)
impact:security (Security boundary, credential, authz, sandbox, or sensitive-data risk.)
issue-rating: 🦞 diamond lobster (Very strong issue quality with high-confidence source-level or clear reproduction.)
Summary
Skill directories are copied into the writable sandbox workspace, allowing agents to modify skill files; they should be bind-mounted read-only instead.
Problem to solve
Skills are instructions; they should never need to be writable. Runtime security cannot be enforced if the very instruction sets that agents are given can be modified by attacking code.
Skill directories are writable by the agent in all sandbox workspaceAccess modes:
"rw": The host workspace is mounted read-write at /workspace. Skills at <workspace>/skills/ are directly writable through this mount. syncSkillsToWorkspace is not called (src/agents/sandbox/context.ts:48, guard: if (cfg.workspaceAccess !== "rw")).
"ro": The host workspace is mounted read-only, but skills are copied into the sandbox workspace via syncSkillsToWorkspace (src/agents/sandbox/context.ts:50-54).
"none": No host workspace is mounted. Skills are handled the same way as in "ro".
In every case, a sandboxed agent can modify skill files. The exec tool doesn't bypass read-only mounts (those are kernel-enforced), but it doesn't need to: skills simply aren't in any read-only mount. They're either part of a writable host mount (rw) or writable copies in the sandbox workspace (ro/none).
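The mode breakdown above can be condensed into a small sketch. It only models the guard quoted from context.ts; the type and function names are hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical model of the behavior described in this issue.
type WorkspaceAccess = "rw" | "ro" | "none";

// Where the skills visible to the agent come from in each mode.
type SkillSource = "host-rw-mount" | "sandbox-copy";

function skillSource(access: WorkspaceAccess): SkillSource {
  // Mirrors the quoted guard: the copy step only runs when access is not "rw".
  if (access !== "rw") {
    return "sandbox-copy";
  }
  return "host-rw-mount";
}

// Both sources are writable from inside the sandbox, which is the gap reported here.
function skillsWritableByAgent(_access: WorkspaceAccess): boolean {
  return true;
}
```

Neither branch places skills behind a read-only mount, so no mode currently protects them.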
Proposed solution
Mount each skill directory as a read-only bind mount in all workspaceAccess modes:
"rw" mode: The host workspace is still mounted read-write at /workspace, but individual skill directories are overlaid with :ro mounts at /workspace/skills/. A Docker mount at a more specific path takes precedence over a broader mount that contains it, so the agent can write to /workspace but not to the skill directories mounted under /workspace/skills/.
"ro"/"none" modes: Skip the syncSkillsToWorkspace copy entirely. Mount skill source directories directly from the host with :ro. No copies, no writable duplicates.
This gives kernel-enforced skill immutability regardless of workspace access mode.
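A minimal sketch of how the per-skill read-only mounts could be expressed as docker run arguments. The SkillDir shape, the skillMountArgs name, and the example paths are assumptions for illustration, not existing OpenClaw code; the only Docker feature relied on is the -v host:container:ro volume syntax.

```typescript
// Hypothetical shape: one entry per skill directory on the host.
interface SkillDir {
  name: string;     // directory name of the skill
  hostPath: string; // absolute path to the skill directory on the host
}

// Builds one read-only bind mount per skill. The same arguments apply in
// every workspaceAccess mode, so no skill copy is needed inside the sandbox.
// Note: paths containing ":" are not handled by this sketch.
function skillMountArgs(
  skills: SkillDir[],
  containerSkillsRoot = "/workspace/skills",
): string[] {
  const args: string[] = [];
  for (const skill of skills) {
    const target = `${containerSkillsRoot}/${skill.name}`;
    // The trailing ":ro" makes Docker mount the directory read-only.
    args.push("-v", `${skill.hostPath}:${target}:ro`);
  }
  return args;
}

// Example with a made-up skill path.
const exampleArgs = skillMountArgs([
  { name: "pdf-tools", hostPath: "/home/user/ws/skills/pdf-tools" },
]);
console.log(exampleArgs.join(" "));
```

In "rw" mode these arguments would be added alongside the existing read-write workspace mount; in "ro" and "none" modes they would replace the copy step.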
The infrastructure for this already exists:
Alternatives considered
wrapToolWorkspaceRootGuard in src/agents/pi-tools.read.ts to reject writes to skills/ paths: exec access can bypass JavaScript path checks via shell commands. Still useful as a complementary measure for non-sandboxed agents.
Impact
Affected users/systems:
All users who have enabled sandboxing (agents.defaults.sandbox.mode: "non-main" or "all"). These users have explicitly opted into isolation -- they expect the sandbox to be a security boundary. This gap means their assumed security posture is stronger than their actual posture.
Severity:
This is not a crash or a UX annoyance. It is a gap in the security model that is invisible to the operator. There is no warning or error when an agent modifies a skill file. The operator has no indication it happened.
Frequency:
Always present
Consequences:
Agent self-modification: An agent can rewrite skill instructions mid-session to alter its own behavior, bypass operator-intended guardrails encoded in skills, or escalate its own capabilities.
Cross-skill tampering: An agent can modify one skill's scripts to affect behavior triggered by another skill's instructions. The operator sees two independent skills but the agent has merged them.
Persistence risk: If the sandbox container is reused across sessions (depending on sandbox lifecycle config), modified skill copies persist into future sessions. The agent's modifications outlive the session that made them.
Evidence/examples
Docker read-only mounting itself was designed with this exact sort of security issue in mind.
MCP incidentally follows the same principle: an agent can call an MCP server, but it cannot alter that server.
Additional information
This could easily be an option in openclaw.json (skillReadOnly: true?) rather than enforced as a breaking change, depending on desired impact.
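As a rough sketch of the opt-in variant, the flag could be resolved with a default that preserves current behavior. Only the skillReadOnly name comes from the suggestion above; the config interface and function name are hypothetical.

```typescript
// Hypothetical config fragment; the real openclaw.json schema may differ.
interface SandboxSkillConfig {
  skillReadOnly?: boolean;
}

function shouldMountSkillsReadOnly(cfg: SandboxSkillConfig): boolean {
  // Defaulting to false keeps existing setups unchanged (opt-in).
  // Defaulting to true instead would make this a breaking change.
  return cfg.skillReadOnly ?? false;
}
```

Which default to ship is the product decision this issue leaves open.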