Skip to content

fix(rest): honour copy_in(overwrite=False) end-to-end#691

Open
G4614 wants to merge 3 commits into
boxlite-ai:mainfrom
G4614:fix/rest-copy-in-overwrite-false
Open

fix(rest): honour copy_in(overwrite=False) end-to-end#691
G4614 wants to merge 3 commits into
boxlite-ai:mainfrom
G4614:fix/rest-copy-in-overwrite-false

Conversation

@G4614

@G4614 G4614 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

copy_in(overwrite=False) silently clobbers existing guest files

Bug

box.copy_in(host, dst, copy_options=CopyOptions(overwrite=False)) over
REST has no effect — the SDK accepts the flag but the runner extracts
the tar with O_TRUNC regardless:

seed /root/v.txt = "guest-original"
copy_in("host-replacement", "/root/v.txt", CopyOptions(overwrite=False))
cat /root/v.txt
  → ""   (truncated by the extract, contents lost)

Root cause: the option is dropped at three seams across the SDK chain:

  • SDK Rust REST client doesn't serialise it onto the request.
  • Runner file-upload handler doesn't parse it off the query string.
  • C FFI has only boxlite_copy_into(...) (no overwrite arg) — the bool
    is hard-coded true at the FFI boundary, so even local-FFI callers
    that opted out got an overwrite-everything copy.

Fix

Plumb the option end-to-end. ABI-stable on the C surface so existing
callers keep working unchanged:

Layer Change
C FFI new boxlite_copy_into_with_options(...); existing boxlite_copy_into kept (defaults overwrite=true)
Go SDK new CopyIntoWithOptions(ctx, src, dst, CopyIntoOptions{Overwrite: bool})
Runner file-upload handler reads ?overwrite=false query param
SDK Rust REST client appends &overwrite=false when caller sets it

Defaults unchanged: old callers still get overwrite=true — the
historical behaviour. Stacks on #688: the underlying tar-extract path
needs to exist before overwrite=False has a place to refuse.

Test plan — two-sided verified

Pin: scripts/test/e2e/cases/test_files_io.py::test_copy_in_overwrite_false_rejects_conflict

Stack: local e2e. C lib + runner + Go SDK static + Python wheel all rebuilt and hot-swapped.

Pre-fix Post-fix
seed /root/v.txt='guest-original', then copy_in('host-replacement', '/root/v.txt', CopyOptions(overwrite=False)), then cat /root/v.txt '' (silently clobbered to empty by tar extract + O_TRUNC) 'guest-original' — refusal honoured by libboxlite
default overwrite=True callers (existing PR #688 paths) unchanged unchanged

Build: make dist:c ✓, cd apps/runner && go build ./cmd/runner ✓. Note that this PR's runner build requires the refreshed libboxlite.a from make dist:c because of the new C-FFI symbol — make dist:c handles staging.

Summary by CodeRabbit

  • New Features

    • File uploads now support explicit overwrite control (default: overwrite=true; set overwrite=false to preserve existing files).
    • Uploads may target either a single file or a directory—both are handled automatically.
  • Bug Fixes

    • Improved upload safety: incoming archives are securely staged and extracted to prevent path-traversal issues.

G4614 added 2 commits June 9, 2026 05:24
The SDK Rust REST client sends `copy_in` as a tar archive (Content-Type:
application/x-tar) over PUT /boxes/{id}/files. The runner controller
was saving the body to `<tmp>/boxlite-upload-*.tar` and then handing
that path straight to `Boxlite.CopyInto`, which treats `src` as a real
filesystem path. End result: the guest got a single binary file
containing the raw tar header + payload at the destination path —
silently breaking both single-file and directory uploads.

  e.g. `box.copy_in("/host/hello.txt", "/root/hello.txt")` →
       guest cat /root/hello.txt prints
       `hello.txt<NUL>0100664…ustar…<payload>` instead of the payload.

Fix: stream the request body through a `tar.Reader` into a staging dir,
then call CopyInto with the extracted path. Two output shapes:

  - single regular file in archive → CopyInto(<extracted-file>, dest)
    so the guest sees the file at dest exactly as the caller requested.
  - anything else (multi-file, directories, symlinks)
    → CopyInto(<staging-dir>, dest), letting the Go SDK's recursive
    copy handle the tree.

Hardening: rejects entries with absolute paths or `../` components
(zip-slip). Single-file detection is pessimistic — any non-regular
entry (dir, symlink, link, device) demotes the upload to the
"copy a directory" path.

E2E regression test pinned:
  sdks/python/tests/e2e/test_files_io.py::test_copy_in_text_roundtrips_byte_exact
  sdks/python/tests/e2e/test_files_io.py::test_copy_in_directory_include_parent_false
Before this change `box.copy_in(host, dst, copy_options=CopyOptions(
overwrite=False))` over REST silently clobbered existing guest files —
the option was dropped at three seams between the SDK and libboxlite:

  1. SDK Rust `copy_into` ignored `_opts: CopyOptions` entirely;
  2. Runner upload handler had no way to tell the Go SDK to refuse
     to overwrite; its `r.Boxlite.CopyInto` took no options;
  3. The C FFI `boxlite_copy_into` hard-coded `overwrite=true` in
     `default_copy_options()`, so even if the Go SDK could pass an
     override, the C side wouldn't honour it.

Plumbed end to end:

  - C FFI : adds `boxlite_copy_into_with_options(handle, src, dst,
            overwrite, cb, user_data, err)` next to the existing
            `boxlite_copy_into` (kept for ABI continuity). Both call
            the new `box_copy_into` internal that now takes
            `CopyOptions` as an argument.
  - Go SDK: adds `CopyIntoOptions{Overwrite bool}` and a
            `CopyIntoWithOptions` method. `CopyInto` becomes a thin
            wrapper that forwards `Overwrite: true` to keep existing
            callers byte-compatible.
  - Runner: client's `CopyIntoWithOptions(...)` and the file-upload
            handler reads `overwrite=false` from the query string
            (set by the SDK Rust REST client below).
  - SDK Rust REST: `copy_into` puts `&overwrite=false` on the URL
            when CopyOptions says so. Default omits the param so
            servers built against the old contract behave unchanged.

E2E regression test:
  sdks/python/tests/e2e/test_files_io.py::test_copy_in_overwrite_false_rejects_conflict
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 896189ca-f939-4481-96a2-3c164469400a

📥 Commits

Reviewing files that changed from the base of the PR and between d0a438c and bd4f353.

📒 Files selected for processing (2)
  • sdks/c/src/copy.rs
  • src/boxlite/src/rest/litebox.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/boxlite/src/rest/litebox.rs
  • sdks/c/src/copy.rs

📝 Walkthrough

Walkthrough

This PR threads explicit overwrite control through copy-in paths across REST, C, Go SDK, and runner client layers, and refactors upload handling to extract tar streams into a staging directory with path traversal checks before invoking copy with options.

Changes

Overwrite-aware copy-in flow

Layer / File(s) Summary
REST copy option usage
src/boxlite/src/rest/litebox.rs
copy_into now uses CopyOptions.overwrite and conditionally adds overwrite=false to the REST upload URL query.
C ABI and Rust copy plumbing
sdks/c/include/boxlite.h, sdks/c/src/copy.rs
Adds boxlite_copy_into_with_options with overwrite, updates internal helper signatures to accept CopyOptions, and forwards provided options into async lite.copy_into.
Go SDK copy options wiring
sdks/go/copy.go
Introduces CopyIntoOptions and CopyIntoWithOptions; existing CopyInto now delegates with defaults. cgo call switches to boxlite_copy_into_with_options and passes opts.Overwrite.
Runner client overwrite method
apps/runner/pkg/boxlite/client.go
Adds client-level CopyIntoWithOptions that forwards overwrite policy to SDK copy, and updates CopyInto docs to clarify default overwrite behavior.
Upload controller staged extraction
apps/runner/pkg/api/controllers/boxlite_files.go
BoxliteFileUpload now parses overwrite query input, extracts tar payloads to a staging directory, selects single-file vs directory source, and calls CopyIntoWithOptions. New helper validates tar entry paths against traversal/absolute escapes and tracks single regular-file archives.

Sequence Diagram

sequenceDiagram
  participant U as Client
  participant C as BoxliteFileUpload
  participant E as extractTarToDir
  participant S as StagingDir
  participant O as CopyIntoWithOptions
  participant R as RestBox

  U->>C: PUT files tar stream with overwrite query
  C->>C: Parse overwrite parameter
  C->>C: Create temporary staging directory
  C->>E: Extract tar stream into staging directory
  E->>S: Write safe entries and preserve relative paths
  E-->>C: Return extractedPath and isSingleFile
  C->>O: Copy using selected source and overwrite flag
  O->>R: Invoke copyInto with options
  R-->>O: Copy result
  O-->>C: Success
  C-->>U: 204 No Content
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I thumped through tar and tidy trees,
guarding paths from sneaky breeze,
I packed an overwrite carrot bright,
and passed it layer-by-layer right,
now files hop in, crisp and neat! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: enabling the copy_in overwrite=False option to work end-to-end across the REST API, which is the core bug being fixed.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@G4614 G4614 marked this pull request as ready for review June 9, 2026 12:23
@G4614 G4614 requested a review from a team as a code owner June 9, 2026 12:23
@G4614 G4614 enabled auto-merge June 9, 2026 12:23

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sdks/c/src/copy.rs (1)

15-24: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix Rust formatting to pass CI.

The formatting check failed on the function call. Expand the arguments across multiple lines.

✨ Suggested formatting fix
 ) -> BoxliteErrorCode {
-    box_copy_into(handle, host_src, guest_dst, default_copy_options(), cb, user_data, out_error)
+    box_copy_into(
+        handle,
+        host_src,
+        guest_dst,
+        default_copy_options(),
+        cb,
+        user_data,
+        out_error,
+    )
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sdks/c/src/copy.rs` around lines 15 - 24, The call to box_copy_into inside
the unsafe extern "C" function boxlite_copy_into is on a single line and fails
rustfmt; reformat the function body so the box_copy_into(...) invocation has its
arguments each on their own line (or grouped across multiple lines) for
readability and to satisfy CI formatting checks, keeping the same arguments:
handle, host_src, guest_dst, default_copy_options(), cb, user_data, out_error
and returning the result.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/boxlite/src/rest/litebox.rs`:
- Around line 225-229: The conditional constructing path in function using
opts.overwrite, box_id, and encoded_dst is written on one line and fails
rustfmt; rewrite the if/else block to use multi-line blocks with proper
indentation (each branch on its own indented line) so the two format! calls are
on separate lines inside their respective { } blocks, e.g. keep the if
opts.overwrite { ... } else { ... } structure but place the format! invocations
on their own lines and align braces consistently.

---

Outside diff comments:
In `@sdks/c/src/copy.rs`:
- Around line 15-24: The call to box_copy_into inside the unsafe extern "C"
function boxlite_copy_into is on a single line and fails rustfmt; reformat the
function body so the box_copy_into(...) invocation has its arguments each on
their own line (or grouped across multiple lines) for readability and to satisfy
CI formatting checks, keeping the same arguments: handle, host_src, guest_dst,
default_copy_options(), cb, user_data, out_error and returning the result.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0a167626-9fc3-42a0-87b1-6901afc17e13

📥 Commits

Reviewing files that changed from the base of the PR and between e4bbd0e and d0a438c.

📒 Files selected for processing (6)
  • apps/runner/pkg/api/controllers/boxlite_files.go
  • apps/runner/pkg/boxlite/client.go
  • sdks/c/include/boxlite.h
  • sdks/c/src/copy.rs
  • sdks/go/copy.go
  • src/boxlite/src/rest/litebox.rs

Comment thread src/boxlite/src/rest/litebox.rs
DorianZheng pushed a commit that referenced this pull request Jun 10, 2026
#710)

add 5 e2e cases pinning REST contracts not covered today

## Coverage gaps

Three classes of behaviour run through the SDK → API → Runner → VM
chain but had no e2e pin on the chain itself; bugs in any one of them
would silently regress through `make test:integration:*` (which only
exercises local-FFI):

```
1. CLI detach lifecycle — `boxlite run -d` returns, CLI process exits,
   does a fresh CLI process still see / exec the box?
   FFI side  src/boxlite/tests/detach.rs, recovery.rs  ✓
   E2E side  scripts/test/e2e/cases/                    ✗
2. Execution.attach / reattach contract — bogus exec_id should be a
   typed client error; reattach to a completed exec should return a
   usable handle.
   Runner side  apps/runner/.../boxlite_exec_attach_test.go  ✓
   SDK <-> API  end-to-end                                    ✗
3. host bind mount via REST — the cloud runtime intentionally dropped
   host bind mounts (#639); REST callers passing host paths must
   silently no-op, no /mnt/<x> in the guest.
   FFI surface  src/boxlite/tests/mount_security.rs  ✓
   REST contract                                      ✗
```

## Cases shipped

```
scripts/test/e2e/cases/test_cli_detach_recovery.py   (2 cases)
scripts/test/e2e/cases/test_exec_attach.py           (2 cases)
scripts/test/e2e/cases/test_volume_readonly.py       (1 case)
```

5 cases total. Author also dropped 6 cases from this branch's earlier
incarnation that were already committed alongside their respective
fix PRs (#686 / #688 / #689 / #691 / #692 / #696), so the diff is
strictly net-new coverage.

## Test plan — run against current main

Stack: local e2e, runner unchanged, no source edits.

| Case | Result | Notes |
|---|---|---|
| `test_detached_box_exec_propagates_exit_code_on_fresh_cli` | ✅ PASS |
exit-code passthrough across CLI processes |
| `test_detached_box_survives_cli_exit_and_is_reusable` | ⚠️ XFAIL
(strict) | reaches step 3 (`boxlite exec <id> echo still-alive`) then
hits the stdout-drop race that #563 fixes — marker drops when #563 lands
|
| `test_attach_with_bogus_id_is_typed_error` | ✅ PASS | bogus exec_id →
typed `Exception` (not 5xx, not silent) |
| `test_reattach_after_original_completes` | ⚠️ XFAIL (strict) | same
stdout-drop race (#563) on the original exec's `out=='first-output'`
assertion |
| `test_host_bind_mount_via_rest_is_silently_ignored` | ✅ PASS | the box
created with `volumes=[(host_dir, "/mnt/ro", True)]` reports
`MOUNT_LINE=<none>` from `/proc/mounts` and the host marker file is
untouched — REST silently dropped the host path |

The 2 XFAILs are tied to #563 (`fix(go-sdk): fold stream drain into
Execution.Wait`). Once #563 merges, both xfails flip xpass-strict, the
markers come off, and the suite is 5/5 green. No additional fix work is
needed in this PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end tests verifying CLI detach survival and box
reusability across fresh CLI processes, including exit code propagation
tests.
* Added end-to-end tests for SDK reattach functionality, validating
session state after execution completion.
* Added end-to-end test confirming proper host bind mount handling
during REST execution.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant