skill-sandbox

The missing security layer for Claude Skills.

skill-sandbox validates any SKILL.md folder and runs its scripts/ in a restricted subprocess — no network by default, wall-time + virtual-memory limits, JSONL audit log per call. Stdlib-only, cross-platform, MIT.

Different from paid-skills: paid-skills is the monetization runtime (host SKILL.md, charge per use), while skill-sandbox is the safety runtime (run any third-party Skill locally without trusting the author). They compose: you can host a Skill via paid-skills inside a skill-sandbox subprocess for defense in depth.

Install

pip install skill-sandbox

Why

In 2026 the Skills ecosystem is exploding: 25+ platforms and 40K+ skills, yet no one ships a sandbox. You're expected to run npx skills add <random-author> and trust that the scripts/ folder won't exfiltrate your data, fork-bomb, or mine crypto on your laptop.

skill-sandbox is the missing layer:

$ skill-sandbox validate ~/Downloads/random-skill
✅ Valid Skill: random-skill
   description: Summarizes a URL using the page title + first 200 chars.
   path: /Users/me/Downloads/random-skill
   scripts:     1 file(s)
   references:  0 file(s)
   assets:      0 file(s)

   Script names:
     - summarize_url.py

$ skill-sandbox run ~/Downloads/random-skill summarize_url.py -- https://example.com
https://example.com → Example Domain (fetched + summarized)

$ skill-sandbox audit ~/Downloads/random-skill summarize_url.py -- https://example.com 2>>audit.jsonl
{"ts":"2026-06-27T...","skill":"random-skill","script":"summarize_url.py","exit_code":0,"duration_ms":142,...}

What it does

- Validates any SKILL.md against the agentskills.io spec: name (≤64 chars, lowercase), description (≤1024 chars, must explain what + when), required scripts/ folder.
- Runs the Skill's scripts in a subprocess with:
  - No network access by default (HTTP_PROXY/HTTPS_PROXY/NO_PROXY blanked); opt in with --allow-network.
  - Wall-time limit (default 30s).
  - Virtual-memory cap (default 512 MB; enforced via RLIMIT_AS on Unix).
  - Working directory pinned to scripts/.
- Captures stdout, stderr, exit code, duration, request_id, and the network_allowed flag.
- Emits a JSONL audit log per call (one record, one line) to stderr, ready for jq, Loki, Datadog, or a SIEM.
- Cross-platform: Linux + macOS + Windows. The memory cap is best-effort on macOS/Windows (kernel-enforced on Linux).
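
The validation rules listed above can be sketched roughly as follows. This is a minimal illustration, not the library's validator: the function name check_skill_metadata is hypothetical, and it assumes name and description have already been parsed out of SKILL.md. Only the rules named in this README are checked.

```python
from pathlib import Path


def check_skill_metadata(skill_dir, name, description):
    """Return a list of rule violations; an empty list means the checks passed.

    Hypothetical helper: assumes name/description were already parsed
    from SKILL.md by some other step.
    """
    problems = []
    if not name or len(name) > 64:
        problems.append("name must be 1-64 characters")
    if name != name.lower():
        problems.append("name must be lowercase")
    if not description or len(description) > 1024:
        problems.append("description must be 1-1024 characters")
    root = Path(skill_dir)
    if not (root / "SKILL.md").is_file():
        problems.append("SKILL.md is missing")
    if not (root / "scripts").is_dir():
        problems.append("scripts/ folder is missing")
    return problems


# A mixed-case name and a path that does not exist both produce violations.
print(check_skill_metadata("./no-such-skill", "My-Skill", "Summarizes a URL."))
```

Returning a list of violations rather than raising on the first one lets a caller report every problem in a single pass; the real validate_skill raises SkillValidationError instead.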

Use

# Validate a skill
skill-sandbox validate <path>

# Run a script (just stdout/stderr + exit code)
skill-sandbox run <path> <script> [-- args...]

# Run + emit JSONL audit record to stderr
skill-sandbox audit <path> <script> [-- args...]

# List the scripts in a Skill
skill-sandbox list <path>
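
The [-- args...] convention above is the usual way to separate the tool's own arguments from those passed through to the script. As a rough sketch of how such a command line could be parsed with the standard library (build_parser is a hypothetical name, and the real CLI may be structured differently and accept more options):

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the four subcommands listed above."""
    parser = argparse.ArgumentParser(prog="skill-sandbox")
    sub = parser.add_subparsers(dest="command", required=True)

    # validate and list only need the Skill path.
    for name in ("validate", "list"):
        cmd = sub.add_parser(name)
        cmd.add_argument("path")

    # run and audit also take a script name plus pass-through arguments.
    for name in ("run", "audit"):
        cmd = sub.add_parser(name)
        cmd.add_argument("path")
        cmd.add_argument("script")
        # Everything after a literal "--" is collected for the script itself.
        cmd.add_argument("args", nargs="*")

    return parser


ns = build_parser().parse_args(
    ["run", "./my-skill", "summarize_url.py", "--", "https://example.com"]
)
print(ns.command, ns.path, ns.script, ns.args)
```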

Python API

import sys

from skill_sandbox import SkillSandbox, validate_skill, SkillValidationError

try:
    meta = validate_skill("./my-skill")
except SkillValidationError as exc:
    print(f"Bad skill: {exc}")
    raise

sandbox = SkillSandbox(
    skill_path="./my-skill",
    timeout=10,           # wall-time limit
    memory_mb=256,        # virtual-memory cap (Unix)
    allow_network=False,  # default: blocked
)
result = sandbox.run_script("summarize_url.py", args=["https://example.com"])

print(f"exit={result.exit_code} duration={result.duration_ms}ms")
print(f"stdout: {result.stdout}")
print(f"timed_out: {result.timed_out}")

# One-line JSONL for log shippers
print(result.to_jsonl(), file=sys.stderr, flush=True)

Architecture

┌────────────────────────────────────────┐
│  skill-sandbox run                     │  ← CLI
│       │                                │
│       ▼                                │
│  ┌─────────────┐                       │
│  │ validator   │ → SKILL.md parse      │
│  └──────┬──────┘   + spec check        │
│         │                              │
│         ▼                              │
│  ┌─────────────┐                       │
│  │ sandbox.py  │ → subprocess.run      │  ← core
│  │             │   - env: empty proxies│
│  │             │   - cwd: scripts/     │
│  │             │   - timeout: 30s      │
│  │             │   - RLIMIT_AS: 512MB  │
│  └──────┬──────┘                       │
│         │                              │
│         ▼                              │
│   {stdout, stderr,                     │
│    exit_code, duration_ms,             │
│    request_id, network_allowed,        │
│    memory_limit_mb, timed_out}         │  ← SkillRunResult
│         │                              │
│         ▼                              │
│   .to_jsonl() → audit.jsonl            │
└────────────────────────────────────────┘

Stdlib-only by design: the sandbox is a security primitive, and pulling in httpx / requests would defeat the point. Cross-platform support comes from subprocess.run; the resource module, used for the memory cap, is Unix-only.
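
To make the core box in the diagram concrete, here is a minimal stdlib-only sketch of that kind of restricted run. It is not the project's sandbox.py: run_restricted and RunResult are hypothetical names, the result fields are copied from the diagram, and the -1 exit code on timeout is an arbitrary choice. It reproduces only the settings named in the diagram. Blanking proxy variables does not by itself stop a process from opening sockets, so any stronger network isolation the tool applies is not shown here.

```python
import json
import os
import subprocess
import sys
import time
import uuid
from dataclasses import asdict, dataclass

try:
    import resource  # Unix-only stdlib module
except ImportError:
    resource = None  # e.g. Windows: no rlimit support


@dataclass
class RunResult:
    """Hypothetical stand-in for SkillRunResult (fields taken from the diagram)."""
    stdout: str
    stderr: str
    exit_code: int
    duration_ms: int
    request_id: str
    network_allowed: bool
    memory_limit_mb: int
    timed_out: bool

    def to_jsonl(self):
        # One record, one line.
        return json.dumps(asdict(self))


def run_restricted(cmd, cwd=None, timeout=30, memory_mb=512, allow_network=False):
    """Run cmd with blanked proxy variables, a wall-time limit, and an RLIMIT_AS cap."""
    env = dict(os.environ)
    if not allow_network:
        for var in ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
            env[var] = ""

    def limit_memory():
        # Runs in the child just before exec; failures are ignored (best-effort).
        cap = memory_mb * 1024 * 1024
        try:
            resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
        except (ValueError, OSError):
            pass

    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd, cwd=cwd, env=env, capture_output=True, text=True,
            timeout=timeout, preexec_fn=limit_memory if resource else None,
        )
        out, err, code, timed_out = proc.stdout, proc.stderr, proc.returncode, False
    except subprocess.TimeoutExpired:
        # Partial output is discarded in this sketch; -1 is an arbitrary marker.
        out, err, code, timed_out = "", "", -1, True

    return RunResult(
        stdout=out, stderr=err, exit_code=code,
        duration_ms=int((time.monotonic() - start) * 1000),
        request_id=uuid.uuid4().hex,
        network_allowed=allow_network,
        memory_limit_mb=memory_mb,
        timed_out=timed_out,
    )


result = run_restricted([sys.executable, "-c", "print('hello')"], timeout=10)
print(result.to_jsonl(), file=sys.stderr)
```

The rlimit is applied through preexec_fn so that it affects only the child process, never the parent that collects the result.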

Compared to alternatives

| Tool | Network policy | Resource limits | Audit log | Cross-platform | stdlib-only |
|---|---|---|---|---|---|
| `bash <script>` | ✗ none | ✗ none | ✗ | ✓ | ✓ |
| Docker `--network=none` | ✓ | ✓ | ✗ | ✓ | ✗ |
| `firejail` | ✓ | ✓ | ✗ | ✗ (Linux only) | ✗ |
| `subprocess + resource` | partial | ✓ | ✗ | ✓ | ✓ |
| `skill-sandbox` | ✓ | ✓ | ✓ | ✓ | ✓ |

The audit log + the per-skill request_id correlation are the unique parts — every Skill invocation produces a structured log line that downstream tooling (Loki, Datadog, your own audit pipeline) can ingest.
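
As an illustration of what downstream tooling might do with these records, the sketch below aggregates JSONL lines per Skill and script. The function name summarize_audit is hypothetical, the two sample records are made up for the example, and only fields shown in the audit output earlier in this README (skill, script, exit_code, duration_ms) are used.

```python
import json
from collections import defaultdict


def summarize_audit(lines):
    """Aggregate JSONL audit records per (skill, script) pair."""
    stats = defaultdict(lambda: {"calls": 0, "failures": 0, "total_ms": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        rec = json.loads(line)
        entry = stats[(rec["skill"], rec["script"])]
        entry["calls"] += 1
        entry["failures"] += int(rec["exit_code"] != 0)
        entry["total_ms"] += rec["duration_ms"]
    return dict(stats)


# Made-up sample records in the shape of the audit output.
sample = [
    '{"skill": "random-skill", "script": "summarize_url.py", "exit_code": 0, "duration_ms": 142}',
    '{"skill": "random-skill", "script": "summarize_url.py", "exit_code": 1, "duration_ms": 58}',
]

for (skill, script), s in summarize_audit(sample).items():
    avg_ms = s["total_ms"] / s["calls"]
    print(f"{skill}/{script}: calls={s['calls']} failures={s['failures']} avg_ms={avg_ms:.0f}")
```

In practice the lines would come from a file such as audit.jsonl, for example by passing an open file object to summarize_audit.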

Roadmap

v0.2 — optional Docker backend for stricter isolation
v0.3 — policy files (skill-sandbox.toml) per-Skill override
v0.4 — Prometheus /metrics endpoint
v1.0 — signed-skill verification (cosign-style attestations)

License

MIT.
