feat: sandbox security improvements (auth proxy, gVisor default, 4-domain policy, Chainguard packages) #696
Description
Context
Four sandbox security improvements identified from deep-dive research across Docker Sandboxes, NVIDIA OpenShell, and LangSmith Sandboxes.
Finding 1: Auth Proxy Pattern
LangSmith Sandboxes + NVIDIA OpenShell Privacy Router: All external service calls are routed through an authentication proxy, so credentials never enter the sandbox runtime. Three independent implementations converged on this pattern.
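A minimal sketch of the pattern, assuming a host-side component that attaches credentials after a request has left the sandbox. The names `OutboundRequest`, `AuthProxy`, and `authorize` are hypothetical and do not come from LangSmith, OpenShell, or the existing codebase; the bearer-token header format is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class OutboundRequest:
    """A request as it leaves the sandbox: no credentials attached (hypothetical type)."""
    host: str
    path: str
    headers: dict[str, str] = field(default_factory=dict)


class AuthProxy:
    """Hypothetical host-side proxy that adds credentials outside the sandbox."""

    def __init__(self, tokens_by_host: dict[str, str]) -> None:
        # Secrets live only in the proxy process, never in the container env.
        self._tokens_by_host = dict(tokens_by_host)

    def authorize(self, request: OutboundRequest) -> OutboundRequest:
        token = self._tokens_by_host.get(request.host)
        if token is None:
            # Unknown destinations are rejected rather than forwarded unauthenticated.
            raise PermissionError(f"no credential configured for {request.host}")
        # Drop any Authorization header the sandbox tried to set itself.
        headers = {
            name: value
            for name, value in request.headers.items()
            if name.lower() != "authorization"
        }
        headers["Authorization"] = f"Bearer {token}"  # assumed header format
        return OutboundRequest(request.host, request.path, headers)
```

The sandbox only ever sees the unauthenticated request; the returned copy with the token exists on the host side alone.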
Finding 2: Chainguard Packages
Chainguard OS Packages: 30k zero-CVE OS packages for building custom hardened images. Extends current Chainguard distroless usage for sandbox images.
Finding 3: gVisor Default for High-Risk Tools
Deep-dive comparison found that gVisor (runsc) provides syscall-level isolation whose strength falls between a standard container and a MicroVM. The DockerSandboxConfig.runtime field already exists -- we just need to change the factory default for the code_execution and terminal tool categories.
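A sketch of how the default could be selected, including the fallback to "runc" proposed in the action items. `select_runtime` and `HIGH_RISK_CATEGORIES` are hypothetical names, not the current contents of factory.py, and `available_runtimes` stands in for whatever a health check would report.

```python
# Tool categories that should get the stronger runtime by default.
HIGH_RISK_CATEGORIES = frozenset({"code_execution", "terminal"})


def select_runtime(tool_category: str, available_runtimes: set[str]) -> str:
    """Pick a container runtime for a tool category (hypothetical helper).

    `available_runtimes` is assumed to come from a health check, for
    example the runtimes the Docker daemon reports as installed.
    """
    if tool_category in HIGH_RISK_CATEGORIES and "runsc" in available_runtimes:
        return "runsc"
    # Either the tool is not high-risk or gVisor is unavailable: use runc.
    return "runc"
```

Falling back silently keeps tools working on hosts without gVisor; whether that should also log a warning or fail closed is a design decision this sketch leaves open.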
Finding 4: 4-Domain Policy Model
Adapted from OpenShell's declarative YAML policy engine. Extend DockerSandboxConfig with a SandboxPolicy model covering filesystem/network/process/inference domains. The inference domain (rerouting LLM calls) directly maps to the auth proxy pattern. Hot-reload via SettingsChangeDispatcher.
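A rough sketch of the four-domain shape. The issue proposes a Pydantic model; this version uses standard-library dataclasses only to stay dependency-free. The network field names mirror the existing DockerSandboxConfig fields, but their types are assumed, and every other field name (`read_only_paths`, `max_processes`, `route_via_proxy`, and so on) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FilesystemPolicy:
    read_only_paths: list[str] = field(default_factory=list)  # hypothetical
    writable_paths: list[str] = field(default_factory=list)   # hypothetical


@dataclass
class NetworkPolicy:
    # Names match the existing config fields; the types are assumptions.
    network: str = "none"
    allowed_hosts: list[str] = field(default_factory=list)
    dns_allowed: bool = False
    loopback_allowed: bool = False


@dataclass
class ProcessPolicy:
    max_processes: int = 64  # hypothetical limit


@dataclass
class InferencePolicy:
    route_via_proxy: bool = True  # hypothetical: send LLM calls through the auth proxy


@dataclass
class SandboxPolicy:
    filesystem: FilesystemPolicy = field(default_factory=FilesystemPolicy)
    network: NetworkPolicy = field(default_factory=NetworkPolicy)
    process: ProcessPolicy = field(default_factory=ProcessPolicy)
    inference: InferencePolicy = field(default_factory=InferencePolicy)


LEGACY_NETWORK_FIELDS = ("network", "allowed_hosts", "dns_allowed", "loopback_allowed")


def policy_from_legacy(legacy: dict) -> SandboxPolicy:
    """Map flat legacy config keys into the nested network domain."""
    policy = SandboxPolicy()
    for name in LEGACY_NETWORK_FIELDS:
        if name in legacy:
            setattr(policy.network, name, legacy[name])
    return policy
```

Keeping the legacy names inside the network domain means old configs can be migrated by a pure mapping step, with the other three domains falling back to defaults.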
Deep Dive Verdict
Keep current sandbox architecture + add these four targeted upgrades. All external options evaluated (Docker Sandboxes, OpenShell, LangSmith) are either Linux-incompatible for MicroVM, alpha-stage, or hosted-only SaaS. Pattern adoption, not wholesale replacement.
Action Items
- gVisor default (1-2 days): Change factory default to "runsc" for code_execution and terminal in factory.py. Add health-check fallback to "runc" if gVisor unavailable.
- Auth proxy / SandboxCredentialManager (3-5 days): Audit all env_overrides call sites passing provider API keys. Design abstraction routing LLM-bound traffic from containers through SynthOrg's provider layer. Credentials never enter sandbox env.
- SandboxPolicy 4-domain model (2-3 days): Extend DockerSandboxConfig with Pydantic model for filesystem/network/process/inference domains. Backwards-compatible migration of existing network, allowed_hosts, dns_allowed, loopback_allowed fields.
- Chainguard Packages evaluation: Evaluate for sandbox image when custom packages needed beyond standard distroless variants.
References
- LangSmith Sandboxes
- Chainguard OS Packages
- NVIDIA OpenShell -- Privacy Router + 4-domain policy
- Docker Sandboxes Architecture -- MicroVM reference (macOS only, no Linux)
- Deep dive: sandbox architecture comparison (2026-03-22)
Additional Research (2026-03-26)
Reward Hacking Categories Reference
Source: LongCat-Flash-Prover (arXiv:2603.21065)
9 categories of reward hacking / agent cheating discovered during formal theorem proving, directly applicable as a security rule engine reference checklist:
- Proposition tampering -- agent modifies the problem statement
- Early termination -- agent declares success prematurely
- Unproven assumptions -- agent asserts facts without evidence
- Context pollution -- agent injects misleading context
- Command injection -- agent embeds commands in data
- Tautological proofs -- agent proves trivially true statements
- Circular reasoning -- agent uses conclusion as premise
- Scope reduction -- agent solves a simpler version of the problem
- Format exploitation -- agent exploits output format expectations
AST-based legality detection validates agent-generated code before execution. Pattern applicable to our security rule engine for validating tool call outputs.
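To illustrate the general idea in Python terms, here is a small sketch that parses generated code and rejects it before execution if it contains disallowed constructs. This is an analogue using Python's standard `ast` module, not the checker described in the paper; the deny-lists and the `find_violations` name are illustrative assumptions.

```python
import ast

# Illustrative deny-lists; a real rule engine would make these configurable.
FORBIDDEN_CALLS = frozenset({"eval", "exec", "compile", "__import__"})
FORBIDDEN_MODULES = frozenset({"os", "subprocess", "socket"})


def find_violations(source: str) -> list[str]:
    """Return a description of each disallowed construct found in `source`."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Code that does not parse is rejected outright.
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught; aliased calls are not.
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"call to {node.func.id}")
    return violations
```

A deny-list like this is easy to bypass through aliasing or attribute access, so it would complement sandbox isolation rather than replace it.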