Skip to content

fix(rest): forward exec user= field across SDK → API → Runner → C FFI#686

Draft
G4614 wants to merge 2 commits into
boxlite-ai:mainfrom
G4614:fix/rest-exec-user-field
Draft

fix(rest): forward exec user= field across SDK → API → Runner → C FFI#686
G4614 wants to merge 2 commits into
boxlite-ai:mainfrom
G4614:fix/rest-exec-user-field

Conversation

@G4614

@G4614 G4614 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

box.exec(..., user="65534") over REST silently dropped the field — guest exec always ran as uid 0 regardless of caller intent. Local-FFI mode honoured it; REST chain didn't. Five layers had their own ExecRequest/ExecutionOptions struct that lacked user; all five now plumb it through to the C FFI's already-present user field.

Test plan — two-sided verified

Pin: scripts/test/e2e/cases/test_exec_user.py

Stack: local e2e (postgres + redis + API + runner + VM). Runner binary swapped between main and PR builds; SDK wheel rebuilt for each side.

Case Pre-fix (main) Post-fix (this PR)
test_exec_user_numeric_uid (box.exec("id",["-u"],user="65534") stdout) `'0
'` — uid stayed root **`'65534
'`** — uid propagated
test_exec_user_uid_gid_pair (echo $(id -u):$(id -g) with user="65534:65534") `'0:0
'` **`'65534:65534
'`**
test_exec_user_invalid_is_typed_error (user="bogus-name") DID NOT RAISE — bogus user silently ignored runner correctly rejects with User 'bogus-name' not found in /etc/passwd BUT the runner wraps it in HTTP 500 instead of 4xx

The third case is out of scope for this PR — it requires extending classifyExecError in apps/runner/pkg/api/controllers/boxlite_exec.go to map spawn_failed: User not found to 400. Filed as follow-up; this PR's contract (user= reaches the guest) is met.

Build:

  • make dist:c
  • cd apps/runner && go build ./cmd/runner
  • make dev:python ✓ (rebuilds the PyO3 binding wheel with the new ExecRequest field)

Before this change `box.exec(..., user="65534")` over the REST chain
silently dropped the field — the guest exec always ran as uid 0 (root)
regardless of what the caller asked for. Local-FFI mode honoured the
field; the divergence only appeared on REST.

Five layers had to be touched because each one had its own
ExecutionOptions / ExecRequest struct that lacked the field:

  - SDK Rust  : src/boxlite/src/rest/types.rs
                ExecRequest gained `user: Option<String>`;
                `from_command` now copies `cmd.user`.
  - API DTO   : apps/api/src/boxlite-rest/dto/exec.dto.ts
                Added `user?` for swagger / validation. The actual
                proxy is transparent (createProxyMiddleware) so the
                DTO is documentation, not parsing — but keeping it in
                sync with the runner schema avoids "the schema lied
                to me" debugging on the next consumer.
  - Runner Go : apps/runner/pkg/api/controllers/boxlite_exec.go
                ExecRequest gained `User *string`; pulled through to
                StartOptions when non-nil.
  - Runner Go : apps/runner/pkg/boxlite/exec_manager.go
                StartOptions + ExecutionOptions kwarg now plumbs User
                into bx.StartExecution.
  - Go SDK    : sdks/go/exec.go
                ExecutionOptions gained `User string`; the C-FFI
                command struct's `user` field (previously hard-coded
                nil) now reflects cfg.User.

The C FFI already accepts `user` (see `sdks/c/src/exec/command.rs`);
no work needed below the cgo seam.

Verified by `make dist:c` + runner rebuild. Stack restart for runtime
verification is gated to operator.

E2E regression test that pins this:
  sdks/python/tests/e2e/test_exec_user.py::test_exec_user_numeric_uid
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c1287b6f-5661-4948-af5a-6291537de410

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

The fix in the preceding commit closes the gap this test exercises.
Mirrors the layout of `scripts/test/e2e/cases/` on main.
G4614 added a commit to G4614/boxlite that referenced this pull request Jun 9, 2026
test_exec_user.py → boxlite-ai#686, test_network_allow_net.py → boxlite-ai#687,
test_files_io.py → boxlite-ai#688, test_box_metrics.py → boxlite-ai#689,
test_snapshot_clone.py → boxlite-ai#694, test_images_pull_list.py → boxlite-ai#696.

Remaining 3 cases (test_exec_attach.py, test_volume_readonly.py,
test_cli_detach_recovery.py) stay here because they pin REST-path gaps
that don't have a matching fix PR in this session — they document the
contract for future work to land against.
DorianZheng pushed a commit that referenced this pull request Jun 10, 2026
#710)

add 5 e2e cases pinning REST contracts not covered today

## Coverage gaps

Three classes of behaviour run through the SDK → API → Runner → VM
chain but had no e2e pin on the chain itself; bugs in any one of them
would silently regress through `make test:integration:*` (which only
exercises local-FFI):

```
1. CLI detach lifecycle — `boxlite run -d` returns, CLI process exits,
   does a fresh CLI process still see / exec the box?
   FFI side  src/boxlite/tests/detach.rs, recovery.rs  ✓
   E2E side  scripts/test/e2e/cases/                    ✗
2. Execution.attach / reattach contract — bogus exec_id should be a
   typed client error; reattach to a completed exec should return a
   usable handle.
   Runner side  apps/runner/.../boxlite_exec_attach_test.go  ✓
   SDK <-> API  end-to-end                                    ✗
3. host bind mount via REST — the cloud runtime intentionally dropped
   host bind mounts (#639); REST callers passing host paths must
   silently no-op, no /mnt/<x> in the guest.
   FFI surface  src/boxlite/tests/mount_security.rs  ✓
   REST contract                                      ✗
```

## Cases shipped

```
scripts/test/e2e/cases/test_cli_detach_recovery.py   (2 cases)
scripts/test/e2e/cases/test_exec_attach.py           (2 cases)
scripts/test/e2e/cases/test_volume_readonly.py       (1 case)
```

5 cases total. Author also dropped 6 cases from this branch's earlier
incarnation that were already committed alongside their respective
fix PRs (#686 / #688 / #689 / #691 / #692 / #696), so the diff is
strictly net-new coverage.

## Test plan — run against current main

Stack: local e2e, runner unchanged, no source edits.

| Case | Result | Notes |
|---|---|---|
| `test_detached_box_exec_propagates_exit_code_on_fresh_cli` | ✅ PASS |
exit-code passthrough across CLI processes |
| `test_detached_box_survives_cli_exit_and_is_reusable` | ⚠️ XFAIL
(strict) | reaches step 3 (`boxlite exec <id> echo still-alive`) then
hits the stdout-drop race that #563 fixes — marker drops when #563 lands
|
| `test_attach_with_bogus_id_is_typed_error` | ✅ PASS | bogus exec_id →
typed `Exception` (not 5xx, not silent) |
| `test_reattach_after_original_completes` | ⚠️ XFAIL (strict) | same
stdout-drop race (#563) on the original exec's `out=='first-output'`
assertion |
| `test_host_bind_mount_via_rest_is_silently_ignored` | ✅ PASS | the box
created with `volumes=[(host_dir, "/mnt/ro", True)]` reports
`MOUNT_LINE=<none>` from `/proc/mounts` and the host marker file is
untouched — REST silently dropped the host path |

The 2 XFAILs are tied to #563 (`fix(go-sdk): fold stream drain into
Execution.Wait`). Once #563 merges, both xfails flip xpass-strict, the
markers come off, and the suite is 5/5 green. No additional fix work is
needed in this PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end tests verifying CLI detach survival and box
reusability across fresh CLI processes, including exit code propagation
tests.
* Added end-to-end tests for SDK reattach functionality, validating
session state after execution completion.
* Added end-to-end test confirming proper host bind mount handling
during REST execution.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant