feat(guest): add auto-idmap for transparent volume UID remapping #399

Merged
DorianZheng merged 2 commits into main from worktree-twinkly-popping-thimble
Mar 20, 2026
@DorianZheng

Summary

  • Auto-detect UID mismatch between host volume owner and container USER
  • Transparently remap UIDs using kernel mount_setattr(MOUNT_ATTR_IDMAP) via open_tree + move_mount
  • Zero user configuration — fully automatic, no VolumeSpec API changes
  • Graceful fallback when kernel doesn't support idmap (volume works with original UIDs)

How it works

Host: stat(volume_path) → owner_uid=501
  ↓ proto BindMount.owner_uid
Guest: Container::start()
  resolve_user(image) → container_uid=1000
  if 501 != 1000 → idmap::remap_mount(501→1000)
  mount point now shows uid=1000 transparently
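
The flow above can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `volume_owner` and `plan_remap` are hypothetical names, and the `read_only` flag is an assumption drawn from the note elsewhere in this PR that read-only volumes get no idmap.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Host side: read the owning UID/GID of the volume path via stat().
/// (Hypothetical helper; the PR passes these values through proto BindMount.)
fn volume_owner(path: &Path) -> io::Result<(u32, u32)> {
    let meta = fs::metadata(path)?;
    Ok((meta.uid(), meta.gid()))
}

/// Guest side: decide whether a remap is needed at all.
/// Returns Some((from, to)) only when the UIDs differ and the volume is writable.
fn plan_remap(owner_uid: u32, container_uid: u32, read_only: bool) -> Option<(u32, u32)> {
    if read_only || owner_uid == container_uid {
        None
    } else {
        Some((owner_uid, container_uid))
    }
}
```

With the values from the diagram, `plan_remap(501, 1000, false)` yields `Some((501, 1000))`, while equal UIDs or a read-only volume yield `None`.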

Files

| Area | Files | Change |
|---|---|---|
| Core | guest/src/storage/idmap.rs | NEW: remap_mount() via open_tree + mount_setattr + move_mount |
| Guest | guest/src/container/lifecycle.rs | Auto-idmap after resolve_user(), before OCI bundle |
| Proto | service.proto | owner_uid/owner_gid on BindMount |
| Host plumbing | types.rs, vmm_spawn.rs, container_volume.rs, container.rs | Pass owner_uid from stat through pipeline |
| Tests | mount_security.rs, unit tests in idmap.rs and types.rs | Feasibility + unit + integration |

Test plan

  • 205 unit/integration tests pass (pre-push hook)
  • Integration test proves mount_setattr(MOUNT_ATTR_IDMAP) works on virtiofs bind mount clones (kernel 6.12)
  • Python example: examples/python/02_features/mount_host_dir.py — mount + ls + read/write verified
  • E2E with non-root USER image (e.g., node:20) to verify auto-remap changes uid from 501→1000

When a container runs as a non-root user (e.g., USER node uid=1000) and
host-mounted volumes are owned by a different UID (e.g., macOS uid=501),
file access fails with permission denied.

This adds automatic ID-mapped mount support using the kernel's new mount
API (open_tree + mount_setattr + move_mount). The guest detects UID
mismatch between host volume owner and container user, then transparently
remaps UIDs at the bind mount level — no chown, no user configuration.

Implementation:
- guest/src/storage/idmap.rs: remap_mount() using kernel mount_setattr
  with MOUNT_ATTR_IDMAP (proven working via integration test on kernel
  6.12 with CONFIG_USER_NS=y)
- guest/src/container/lifecycle.rs: auto-idmap in Container::start()
  after resolve_user(), before OCI bundle creation
- Host stat(host_path) passes owner_uid/gid through proto BindMount
  to guest for reliable UID detection (avoids virtiofs xattr issues)
- Graceful fallback: if kernel doesn't support idmap, volume works
  with original UIDs (no error)
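
The graceful-fallback behaviour in the last bullet might look something like the sketch below. It is an illustration only: `remap_or_fallback` and `MountOutcome` are hypothetical names, and treating `ENOSYS` and `EINVAL` as "idmap unsupported" is an assumption, not something this PR states.

```rust
use std::io;

// Linux errno values, hard-coded so the sketch needs only std.
const EINVAL: i32 = 22;
const ENOSYS: i32 = 38;

#[derive(Debug, PartialEq)]
enum MountOutcome {
    /// The idmap was applied to the mount.
    Remapped,
    /// The kernel or filesystem lacks idmap support; the volume keeps its original UIDs.
    OriginalUids,
}

/// Run the remap step and downgrade "unsupported" errors to a fallback.
/// Any other error is still propagated to the caller.
fn remap_or_fallback<F>(remap: F) -> io::Result<MountOutcome>
where
    F: FnOnce() -> io::Result<()>,
{
    match remap() {
        Ok(()) => Ok(MountOutcome::Remapped),
        Err(e) if matches!(e.raw_os_error(), Some(ENOSYS) | Some(EINVAL)) => {
            eprintln!("idmap unsupported ({e}); keeping original UIDs");
            Ok(MountOutcome::OriginalUids)
        }
        Err(e) => Err(e),
    }
}
```

Passing the remap step in as a closure keeps the fallback policy separate from the syscall sequence, so the policy can be unit-tested without a privileged mount.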

Zero user-facing API changes; fully automatic.

The initial single-entry UID mapping (e.g., 501→0, count=1) caused all
unmapped UIDs to show up as the overflow UID (65534), because the kernel
maps any UID not covered by the userns uid_map to nobody.
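
To make the overflow behaviour concrete, here is a small userspace model of the lookup. It only imitates the rule described above and is not kernel code; `Extent`, `map_uid`, and the `from`/`to` field names are hypothetical.

```rust
/// UID reported for anything the mapping does not cover ("nobody").
const OVERFLOW_UID: u32 = 65534;

/// One mapping extent: `count` consecutive IDs starting at `from` map to `to`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Extent {
    from: u32,
    to: u32,
    count: u32,
}

/// Translate a UID through the extents; uncovered UIDs fall back to the overflow UID.
fn map_uid(extents: &[Extent], uid: u32) -> u32 {
    for e in extents {
        if uid >= e.from && uid - e.from < e.count {
            return e.to + (uid - e.from);
        }
    }
    OVERFLOW_UID
}
```

With the single entry `Extent { from: 501, to: 0, count: 1 }`, UID 501 maps to 0, but every other UID (for example 0 or 1000) comes back as 65534.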

Fix: generate a full-range swap mapping that covers all 65536 UIDs.
For example, swapping 501↔0 produces:
  0   501  1      (swap)
  1   1    500    (identity)
  501 0    1      (reverse swap)
  502 502  65034  (identity)

This ensures no UID overflows while correctly remapping the target pair.
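
A full-range swap mapping like the one listed above can be generated along these lines. This is a sketch and not the PR's build_swap_mapping(): the function name, the `Extent` struct, and the `Option` return for out-of-range IDs are assumptions, and the three fields simply mirror the three columns of the listing.

```rust
/// Size of the ID range the mapping must cover.
const ID_RANGE: u32 = 65536;

/// One line of the mapping, in the same column order as the listing above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Extent {
    first: u32,
    second: u32,
    count: u32,
}

/// Identity extent covering `count` IDs starting at `start`.
fn identity(start: u32, count: u32) -> Extent {
    Extent { first: start, second: start, count }
}

/// Build extents that swap `a` and `b` and map every other ID to itself.
/// Returns None if either ID lies outside 0..65536.
fn build_swap_mapping_sketch(a: u32, b: u32) -> Option<Vec<Extent>> {
    if a >= ID_RANGE || b >= ID_RANGE {
        return None;
    }
    if a == b {
        return Some(vec![identity(0, ID_RANGE)]);
    }
    let (lo, hi) = if a < b { (a, b) } else { (b, a) };
    let mut out = Vec::with_capacity(5);
    if lo > 0 {
        out.push(identity(0, lo)); // IDs below the pair
    }
    out.push(Extent { first: lo, second: hi, count: 1 }); // swap
    if hi - lo > 1 {
        out.push(identity(lo + 1, hi - lo - 1)); // IDs between the pair
    }
    out.push(Extent { first: hi, second: lo, count: 1 }); // reverse swap
    if hi + 1 < ID_RANGE {
        out.push(identity(hi + 1, ID_RANGE - hi - 1)); // IDs above the pair
    }
    Some(out)
}
```

For the pair 501 and 0 this produces the four extents shown above. The counts always sum to 65536, so no ID is left uncovered.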

Verified working:
- Root container (USER=0): host uid=501 → container uid=0 ✓
- Non-root user (USER=1000): host uid=501 → container uid=1000 ✓
- Read-only volume: no idmap applied ✓
- Host file ownership preserved after guest writes ✓

Also adds:
- build_swap_mapping() with 5 unit tests
- Python test examples for volume mounting and auto-idmap
DorianZheng merged commit 8ae679d into main Mar 20, 2026
17 checks passed
DorianZheng deleted the worktree-twinkly-popping-thimble branch March 20, 2026 20:05
DorianZheng added a commit that referenced this pull request Mar 22, 2026
… logging (#399)

Add opt-in security layers for sandbox isolation:

## Network Allowlist (NetworkSpec::Restricted)
- DNS sinkhole blocks hostname resolution for non-allowed hosts
- Rust proxy intercepts all outbound TCP via gvproxy DialFunc patch
- Supports exact hostname, wildcard, IP, and CIDR rules
- Zero overhead when not enabled (default: NetworkSpec::Isolated)
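
As a rough illustration of the four rule kinds listed above, the sketch below matches a target against exact hostname, wildcard, IP, and IPv4 CIDR rules. It is not the boxlite-proxy filter: `Rule`, `parse_rule`, and `allowed` are hypothetical names, the `*.` wildcard syntax is assumed, and the choice that a wildcard does not match the bare apex domain is also an assumption.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// One allowlist rule (hypothetical representation).
enum Rule {
    Host(String),        // exact hostname
    Wildcard(String),    // "*.example.com" stored as the suffix "example.com"
    Ip(IpAddr),          // single address
    Cidr4(Ipv4Addr, u8), // IPv4 network and prefix length
}

fn parse_rule(s: &str) -> Option<Rule> {
    if let Some(suffix) = s.strip_prefix("*.") {
        return Some(Rule::Wildcard(suffix.to_ascii_lowercase()));
    }
    if let Some((net, len)) = s.split_once('/') {
        let net: Ipv4Addr = net.parse().ok()?;
        let len: u8 = len.parse().ok()?;
        return if len <= 32 { Some(Rule::Cidr4(net, len)) } else { None };
    }
    if let Ok(ip) = s.parse::<IpAddr>() {
        return Some(Rule::Ip(ip));
    }
    Some(Rule::Host(s.to_ascii_lowercase()))
}

/// True if `ip` falls inside the network `net`/`len`.
fn in_cidr4(ip: Ipv4Addr, net: Ipv4Addr, len: u8) -> bool {
    let mask = if len == 0 { 0 } else { u32::MAX << (32 - u32::from(len)) };
    (u32::from(ip) & mask) == (u32::from(net) & mask)
}

/// True if `target` (a hostname or IP literal) matches any rule.
fn allowed(rules: &[Rule], target: &str) -> bool {
    let host = target.to_ascii_lowercase();
    let ip = host.parse::<IpAddr>().ok();
    rules.iter().any(|r| match r {
        Rule::Host(h) => *h == host,
        // Require a dot before the suffix so "badexample.com" does not match.
        Rule::Wildcard(suffix) => host
            .strip_suffix(suffix.as_str())
            .map_or(false, |prefix| prefix.ends_with('.')),
        Rule::Ip(a) => ip == Some(*a),
        Rule::Cidr4(net, len) => match ip {
            Some(IpAddr::V4(v4)) => in_cidr4(v4, *net, *len),
            _ => false,
        },
    })
}
```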

## Secret Substitution (BoxOptions.secrets)
- Secrets never exposed as env vars (guest sees placeholder)
- Real values substituted transparently via TLS MITM on outbound HTTPS
- Per-box ephemeral CA with per-host cert generation (rcgen + rustls)
- Host-scoped: each secret declares which hosts may receive it

## Audit Logging (LiteBox::audit_log())
- Records lifecycle, execution, and file transfer events
- Bounded ring buffer (default 1000 events, configurable)
- Thread-safe, zero-copy event recording
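
A bounded ring buffer of the kind described above could be sketched as follows. `AuditRing` and its methods are hypothetical names, not the LiteBox API, and the sketch leaves out thread safety (wrapping the buffer in a `Mutex` would be one way to add it).

```rust
use std::collections::VecDeque;

/// Fixed-capacity event log: once full, each new event evicts the oldest one.
struct AuditRing<T> {
    cap: usize,
    events: VecDeque<T>,
}

impl<T> AuditRing<T> {
    /// Create a ring holding at most `cap` events (minimum 1).
    fn new(cap: usize) -> Self {
        let cap = cap.max(1);
        Self { cap, events: VecDeque::with_capacity(cap) }
    }

    /// Record an event, dropping the oldest one if the ring is full.
    fn record(&mut self, event: T) {
        if self.events.len() == self.cap {
            self.events.pop_front();
        }
        self.events.push_back(event);
    }

    /// Borrow the retained events, oldest first.
    fn snapshot(&self) -> Vec<&T> {
        self.events.iter().collect()
    }
}
```

With a capacity of 3, recording the events 1 through 5 leaves only 3, 4, and 5 in the snapshot.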

## Architecture
- boxlite-proxy crate: filter, MITM, cert gen (13 tests)
- boxlite/src/audit/: event types and recorder (7 tests)
- 20-line vendor patch to gvproxy for DialFunc TCP interception
- Go DNS filter with zone-based sinkhole (11 tests)
- Python SDK: SecretSpec, NetworkPolicy, audit_log() bindings
- Each layer only activates when user opts in via BoxOptions