Explore per-agent worker isolation, VFS scratch storage, and host filesystem capabilities #78096

## Summary

Long-term architecture note: explore whether OpenClaw should combine per-agent worker/process isolation, a virtual filesystem layer, and the existing real-host filesystem safety work into a more explicit filesystem capability model.

This is not a request to replace the current `fs-safe`/pinned-path helpers immediately. The current implementation is solving a real gap in Node's filesystem APIs. The goal here is to track adjacent OSS work and define experiments that could reduce risk, reduce Python helper dependency over time, and give each agent a clearer filesystem boundary.

## Current OpenClaw Problem Space

OpenClaw has a long-lived gateway with many agents. Different agents may need different workspace roots in the same overall gateway runtime.

We currently have user-visible operations that eventually need safe real-host filesystem mutation:

- agent patch/apply operations
- workspace file set/create/update operations
- memory/session export writes
- shell/pi host write/edit operations
- archive/media staging copies
- sandbox file mutations such as write, mkdir, remove, rename

These are host filesystem operations, not only scratch-state operations. They need to resist path traversal, symlink/hardlink tricks, and TOCTOU races.

Node's standard `fs` APIs do not provide the full `openat`/dirfd-style capability interface we would want for this. That is why the current code has pinned path/write helpers and, on POSIX, Python helper paths for specific operations where Python exposes `dir_fd`-style primitives that Node does not.
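As a rough illustration of the gap, the sketch below shows a path that passes a purely lexical containment check while actually resolving outside the root through a symlink. The helper names (`lexicallyInside`, `demoSymlinkEscape`) and the temp-directory layout are hypothetical and are not taken from OpenClaw or `fs-safe`. It assumes a POSIX-like host where the process may create symlinks.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Lexical check only: compares path strings and never touches the filesystem.
function lexicallyInside(root: string, target: string): boolean {
  const rel = path.relative(root, path.resolve(root, target));
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Builds a throwaway workspace containing a symlink that points outside it,
// then compares the lexical verdict with the verdict after resolving symlinks.
function demoSymlinkEscape(): { lexicalOk: boolean; realOk: boolean } {
  const base = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), "cap-demo-")));
  const root = path.join(base, "workspace");
  const outside = path.join(base, "outside");
  fs.mkdirSync(root);
  fs.mkdirSync(outside);
  fs.writeFileSync(path.join(outside, "secret.txt"), "secret");
  fs.symlinkSync(outside, path.join(root, "link"), "dir");

  const requested = path.join(root, "link", "secret.txt");
  const lexicalOk = lexicallyInside(root, requested);
  // Resolving first exposes the escape, but check-then-use is itself racy:
  // the link can be swapped between this check and a later open (TOCTOU).
  const realOk = lexicallyInside(root, fs.realpathSync(requested));

  fs.rmSync(base, { recursive: true, force: true });
  return { lexicalOk, realOk };
}

console.log(demoSymlinkEscape());
```

Even the `realpathSync` variant only narrows the window. Closing it needs operations that stay anchored to an already-open directory, which is the `openat`/dirfd shape described above.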

## Why Node Permissions Alone Are Not Enough

Node has a permission model:

- `--permission`
- `--allow-fs-read`
- `--allow-fs-write`
- `--allow-worker`

This is useful as an outer process or worker sandbox, but it is not the same abstraction as a per-root filesystem capability object.

Limitations for OpenClaw's shape:

- It is process/worker launch policy, not an object API like `root.write("path")`.
- It is coarse-grained for a long-lived gateway containing many agents with different roots.
- Workers need correctly configured `execArgv`; trusted parent code must enforce that every time.
- `--allow-worker` is explicitly risky because worker creation can weaken the model if exposed incorrectly.
- Node documents permission-model constraints around symlinks, existing file descriptors, native modules, subprocesses, worker threads, inspector, and WASI.
- It does not provide pinned dirfd/openat-style operations for safe real-host mutations.

So Node permissions may be useful as an additional outer guard, but should not be treated as a replacement for a capability-safe host filesystem API.

Reference: https://nodejs.org/api/permissions.html
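To make the "launch policy, not object API" point concrete, here is a minimal sketch of how a trusted parent might derive permission flags from a per-agent policy. `AgentFsPolicy` and `permissionFlags` are hypothetical names. The flag spellings are the ones listed above; older Node releases used `--experimental-permission`, and the exact path form accepted for directories should be checked against the Node version in use.

```typescript
import * as path from "node:path";

// Hypothetical per-agent policy; not an existing OpenClaw type.
interface AgentFsPolicy {
  readRoots: string[];
  writeRoots: string[];
}

// Converts a policy into Node permission-model CLI flags.
// path.posix keeps the output deterministic for this illustration.
function permissionFlags(policy: AgentFsPolicy): string[] {
  const flags = ["--permission"];
  for (const root of policy.readRoots) {
    flags.push(`--allow-fs-read=${path.posix.resolve(root)}`);
  }
  for (const root of policy.writeRoots) {
    flags.push(`--allow-fs-write=${path.posix.resolve(root)}`);
  }
  return flags;
}

const flags = permissionFlags({
  readRoots: ["/srv/agents/a", "/srv/shared"],
  writeRoots: ["/srv/agents/a"],
});
console.log(flags.join(" "));
```

The result is a flat list fixed at launch time. Nothing in it can be handed to a single tool call as a narrower grant, which is why the flags work as an outer guard rather than as a per-root capability.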

## Relevant OSS References

### Platformatic Regina

Repo: https://github.com/platformatic/regina
License: Apache-2.0

Regina is a multi-agent orchestrator built on Platformatic Watt. It discovers markdown agent definitions and spawns each agent as an isolated application thread.

Relevant ideas:

- per-agent application/thread lifecycle
- idle suspension/resume
- per-agent state storage
- cross-pod migration/state backup
- built-in tools backed by a per-instance virtual filesystem

This is highly relevant architecturally, but it does not directly solve safe host-folder mutation. Regina's default file tools operate inside a virtual filesystem.

### Platformatic Runtime / Watt

Repo: https://github.com/platformatic/platformatic
License: Apache-2.0

Platformatic/Watt runs applications in Node worker threads. The runtime creates each worker with per-application `execArgv`, and its runtime config supports filesystem permissions that it converts into Node permission flags.

Relevant ideas:

- worker-thread application isolation
- per-application `execArgv`
- worker lifecycle, health, restart, and management APIs
- thread interceptor / mesh routing
- optional process mode for heavier isolation
- application-level permissions based on Node's permission model

This could inspire an OpenClaw experiment where agents run in separate workers/processes with an outer permission boundary.

### @platformatic/vfs

Package: https://www.npmjs.com/package/@platformatic/vfs
Repo: https://github.com/platformatic/vfs
License: MIT

`@platformatic/vfs` is described as a virtual filesystem for Node.js, a userland shim for `node:vfs`. Regina uses it with a SQLite provider for per-agent VFS state.

Relevant ideas:

- per-agent virtual scratch/state filesystem
- SQLite-backed persistence
- a filesystem-like API that can back tools without exposing host paths
- possible future alignment if Node gains a native VFS layer

This is complementary to `fs-safe`, not a direct replacement. It can make many agent tool operations avoid the host filesystem entirely, but real workspace edits still need safe host mutation.

### fs-safe

Site: https://fs-safe.io/
Repo/package local context: `../fs-safe`

`fs-safe` is the host-filesystem capability layer extracted from OpenClaw security work. Its role is closer to Go's `os.Root` / Rust `cap-std`: pass around a root capability and perform safe relative operations beneath it.

This remains the more direct fit for OpenClaw workspace edits and other real-host file operations.

### Go os.Root

Reference: https://go.dev/blog/osroot

Go added `os.Root` to address traversal-resistant filesystem access. This is the general shape OpenClaw wants for real-host filesystem access: rooted operations, not ad hoc string prefix checks.
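A short sketch of why ad hoc string prefix checks are fragile, written in TypeScript to match the rest of the codebase. Both function names are hypothetical. The second check is still purely lexical, so it fixes the prefix bug but does nothing about symlinks or races, which is the part `os.Root` handles at the OS level.

```typescript
import * as path from "node:path";

// Ad hoc prefix check: a sibling directory sharing the prefix slips through.
function naivePrefixCheck(root: string, target: string): boolean {
  return path.posix.resolve(target).startsWith(path.posix.resolve(root));
}

// Lexical containment via a relative path: rejects siblings and "..".
function isInsideRoot(root: string, target: string): boolean {
  const rel = path.posix.relative(root, path.posix.resolve(root, target));
  return rel !== ".." && !rel.startsWith("../") && !path.posix.isAbsolute(rel);
}

const root = "/srv/agents/a";
console.log(naivePrefixCheck(root, "/srv/agents/a-evil/x")); // true (wrongly accepted)
console.log(isInsideRoot(root, "/srv/agents/a-evil/x"));     // false
console.log(isInsideRoot(root, "../b/secret"));              // false
console.log(isInsideRoot(root, "notes/todo.md"));            // true
```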

### Rust cap-std

Repo: https://github.com/bytecodealliance/cap-std

Rust's `cap-std` is another object-capability filesystem design. Useful reference for API shape and threat model.

### Linux openat2 option

Package: https://www.npmjs.com/package/@cocalc/openat2

`@cocalc/openat2` exposes Linux `openat2`/dirfd-based primitives via a native addon. It may be useful as an optional Linux fast path for `fs-safe`, but it is not portable enough to be the whole answer.

## Possible Direction

Layered model:

```text
outer runtime boundary      worker/process permissions, env sanitization, subprocess policy
virtual agent filesystem    per-agent scratch/state VFS, possibly @platformatic/vfs-style
host capability filesystem  fs-safe root objects for real workspace/media/archive mutations
```

The key distinction:

- virtual FS is great for agent-local state and tool scratch files
- real-host FS capability is still needed for workspace edits, media staging, archive extraction, and integrations that must touch real files
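The split between the two inner layers could look roughly like the sketch below. Every name here (`FsCapability`, `ScratchFs`, `AgentFilesystems`, `writeToWorkspace`) is hypothetical and does not reflect the `fs-safe` or `@platformatic/vfs` APIs. The scratch layer is a plain in-memory map standing in for a real VFS, and the host layer is left as an optional grant that a real implementation would back with `fs-safe`.

```typescript
// Hypothetical capability interface shared by both layers.
interface FsCapability {
  read(relPath: string): string;
  write(relPath: string, data: string): void;
}

// Normalizes a relative path and refuses anything that would leave the root.
function normalizeRel(relPath: string): string {
  if (relPath.startsWith("/")) throw new Error("absolute paths are not allowed");
  const parts: string[] = [];
  for (const seg of relPath.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (parts.length === 0) throw new Error("path escapes root");
      parts.pop();
      continue;
    }
    parts.push(seg);
  }
  if (parts.length === 0) throw new Error("path names the root itself");
  return parts.join("/");
}

// In-memory stand-in for a per-agent scratch VFS; never touches the host.
class ScratchFs implements FsCapability {
  private readonly files = new Map<string, string>();
  read(relPath: string): string {
    const key = normalizeRel(relPath);
    const data = this.files.get(key);
    if (data === undefined) throw new Error(`not found: ${key}`);
    return data;
  }
  write(relPath: string, data: string): void {
    this.files.set(normalizeRel(relPath), data);
  }
}

// An agent always has scratch storage; host workspace access is an explicit grant.
interface AgentFilesystems {
  scratch: FsCapability;
  workspace?: FsCapability;
}

function writeToWorkspace(fsys: AgentFilesystems, relPath: string, data: string): void {
  if (!fsys.workspace) throw new Error("no host workspace capability granted");
  fsys.workspace.write(relPath, data);
}

const agent: AgentFilesystems = { scratch: new ScratchFs() };
agent.scratch.write("tmp/./plan.md", "draft");
console.log(agent.scratch.read("tmp/plan.md")); // prints "draft"
```

Because scratch paths are only map keys, traversal has nothing on the host to reach. The same rules would not be sufficient for the host layer, where symlinks and races apply.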

## Experiments To Try

- Prototype one agent running in a worker with per-worker `execArgv` permission flags.
- Prototype the same with process mode and compare isolation, startup cost, memory, and operational complexity.
- Try a per-agent VFS scratch area for default tool state that does not need to touch the host workspace.
- Evaluate whether `@platformatic/vfs` can back any existing OpenClaw scratch/session/tool paths without changing user-visible semantics.
- Keep real workspace edits on `fs-safe` and measure how much Python helper usage remains after moving scratch-only paths to VFS.
- Explore an optional Linux `openat2` fast path inside `fs-safe` while keeping portable POSIX/macOS behavior.
- Document which operations require real host paths versus virtual/scratch paths.
- Define an API boundary so extensions/plugins receive capabilities rather than raw host paths whenever feasible.
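For the last experiment, one possible shape is to hand plugins an attenuated capability instead of a path string. The sketch below is an assumption about how that boundary might look: `FileCap`, `memoryCap`, `readOnly`, and `summarizePlugin` are illustrative names, and the in-memory backing only stands in for a real root object.

```typescript
// Hypothetical capability a plugin receives instead of a raw host path.
interface FileCap {
  read(rel: string): string;
  write(rel: string, data: string): void;
}

// Minimal in-memory backing, used only to keep the sketch runnable.
function memoryCap(): FileCap {
  const files = new Map<string, string>();
  return {
    read(rel) {
      const data = files.get(rel);
      if (data === undefined) throw new Error(`not found: ${rel}`);
      return data;
    },
    write(rel, data) {
      files.set(rel, data);
    },
  };
}

// Attenuation: wraps a capability so the holder can read but not write.
function readOnly(cap: FileCap): FileCap {
  return {
    read: (rel) => cap.read(rel),
    write: () => {
      throw new Error("capability is read-only");
    },
  };
}

// Hypothetical plugin: it can only touch what it was handed.
function summarizePlugin(files: FileCap): number {
  return files.read("notes.md").length;
}

const full = memoryCap();
full.write("notes.md", "hello");
console.log(summarizePlugin(readOnly(full))); // prints 5
```

The plugin never sees a host path, so the gateway decides how much authority to pass along on each call.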

## Questions

- Which OpenClaw file operations truly need real host paths?
- Which operations are only agent scratch/state and could move into VFS?
- Should an agent's default shell/write/edit tools operate in VFS by default, with explicit capability grants for host workspace edits?
- Can worker/process permissions act as defense-in-depth without making gateway lifecycle or debugging too complex?
- Can `fs-safe` become the shared host capability layer used by both OpenClaw and external consumers?
- How do we preserve current OpenClaw UX where agents edit real repos, while giving each agent a clearer boundary?

## Non-Goals For Now

- Do not replace `fs-safe` with Node permissions.
- Do not migrate all agent execution to Platformatic/Regina.
- Do not make virtual filesystems the only storage model for workspace edits.
- Do not remove Python helpers until a portable alternative is shown to be equally safe.
- Do not add a large dependency to core without measuring install/runtime cost and ownership impact.

## Success Criteria

A future design would be successful if it:

- preserves real workspace editing behavior
- reduces direct raw-path handling in agent/plugin code
- narrows or removes Python helper usage where safely possible
- gives each agent an explicit filesystem boundary
- uses Node permissions or process isolation only as defense-in-depth
- keeps extension/plugin dependency ownership clean
- provides measurable performance and security improvements
- has regression tests for symlink, hardlink, traversal, rename, copy, remove, and TOCTOU-style races

## License Notes

- Platformatic Regina: Apache-2.0
- Platformatic Runtime/Watt: Apache-2.0
- `@platformatic/vfs`: MIT
- `@cocalc/openat2`: MIT
- Rust `cap-std`: Apache-2.0 WITH LLVM-exception OR Apache-2.0 OR MIT, depending on the crate; verify the exact component before reuse

## Initial Assessment

This seems worth tracking as a long-term vision item. The strongest path is probably not "replace `fs-safe`", but:

1. keep `fs-safe` as the real-host filesystem capability layer;
2. add a VFS/scratch layer where real host writes are unnecessary;
3. optionally wrap agents in worker/process permission boundaries for defense-in-depth;
4. reduce Python helpers only when an equally safe native/Node/optional-addon path exists.

