Skip to content

[feature] Add user volume support with resolved paths and read-only option#2

Merged
DorianZheng merged 1 commit into
mainfrom
feature/user-volume
Dec 10, 2025
Merged

[feature] Add user volume support with resolved paths and read-only option#2
DorianZheng merged 1 commit into
mainfrom
feature/user-volume

Conversation

@DorianZheng

Copy link
Copy Markdown
Member

No description provided.

Signed-off-by: dorianzheng <xingzhengde72@gmail.com>
@DorianZheng DorianZheng merged commit b226962 into main Dec 10, 2025
4 checks passed
@DorianZheng DorianZheng deleted the feature/user-volume branch December 10, 2025 07:51
DorianZheng added a commit that referenced this pull request May 9, 2026
Bundles two adversarial-review rounds on the post-and-drain async
callback C/Go SDK (PR #495). Squash-merging because the diffs interleave
across event_queue.rs, execution.rs, runtime.rs, and the producer sites,
with round-2 build-depending on round-1 (e.g. dispatch_handle_event<T>
signature requires OwnedFfiPtr<T>).

Round-1 findings:
- NULL callback at FFI boundary invoked UB before any Rust code ran.
  Fix: typedefs are now Option<extern "C" fn(...)>; an unwrap_cb_or_return!
  macro at every public entrypoint rejects NULL with InvalidArgument.
- Go SDK leaked cgo.Handle (and live VMs, for Create) when caller's ctx
  cancelled before the C task completed. Fix: abandonAsync /
  abandonAsyncErr / drainAndDelete helpers detach cleanup onto a
  goroutine so callers return ctx.Err() promptly AND the handle is
  reclaimed when the result eventually arrives.

Round-2 findings:
- Execution::wait held the inner mutex across recv().await, blocking
  kill/signal/resize_tty when wait was parked. Fix: split lock into
  inner (interface ops) and wait_state (OnceCell + rx); wait now &self.
  Knock-on: cargo clippy --fix removed obsolete `mut` at 58 callsites
  across test-utils, node SDK, and boxlite integration tests that no
  longer needed mutable bindings.
- push_event_with_capacity dropped events on close, leaking
  Box::into_raw'd payloads for 6 lifecycle variants. Fix: OwnedFfiPtr<T>
  smart-pointer reclaims the heap on Drop, take() disarms on dispatch.
- Per-execution cgo.Handle leaked because stream/exit callbacks were
  not strictly ordered. Fix: stream pumps signal completion via oneshot;
  exit_pump awaits all stream-dones before pushing Exit; execution_free
  synthesises an EXIT_CODE_FORCE_CLOSED Exit on abort. Go now safely
  deletes the handle in the exit callback.

Tests:
- src/boxlite/src/litebox/exec.rs: wait_does_not_block_kill (round-2 #1)
- sdks/c/src/event_queue.rs: owned_ffi_ptr_* + closed-queue tests (round-2 #2)
- sdks/c/tests/test_null_callback.c: NULL callback rejection (round-1)
- sdks/go/abandon_async_test.go: round-1 abandonAsync helpers +
  TestExecutionStreamState_HandleDeletedOnExit (round-2 #3)
DorianZheng added a commit that referenced this pull request May 9, 2026
stdout_pump and stderr_pump appended `b'\n'` to every chunk before
pushing the Stdout/Stderr event with a comment claiming to "restore
the line terminator that the upstream stream stripped". The
upstream does not strip terminators. Per
`boxlite::portal::interfaces::exec::route_output`:

    String::from_utf8_lossy(&chunk.data).to_string()

— raw byte chunks forwarded as-is, no line splitting, no newline
stripping. The variable name `line` was a fiction; chunks are
arbitrary byte boundaries.

Concrete corruption observed:
  - `printf hello` (no trailing newline) -> consumer saw `hello\n`.
  - Multi-chunk output -> extra `\n` injected at chunk boundaries.

Fix: delete `bytes.push(b'\n')` and the misleading comment in both
pumps. Rename local `line` -> `chunk` to match upstream terminology
and prevent future readers from making the same incorrect
assumption I did.

Reproducer (commit 04ca0459) flips red -> green:
  test ::stdout_pump_forwards_chunks_byte_exact ... ok
  test ::stderr_pump_forwards_chunks_byte_exact ... ok

The reproducer covers chunks with no trailing newline, multi-chunk
sequences, already-newlined chunks, and control-byte-bearing
chunks. Full C SDK lib suite: 37 passed, 0 failed.

REFLECT note: Upstream contract read (a 30-second `grep` on
`stdout_tx.send` in `src/boxlite/src/portal/`) would have caught
this. Recorded in the round-3 reflection.
DorianZheng added a commit that referenced this pull request May 9, 2026
* feat(c-ffi): post-and-drain async callback C API (phase 2 rust)

Replace ~17 sync block_on C functions with async + callback variants
backed by a per-runtime event queue. All async work spawns a Tokio task
that pushes a typed RuntimeEvent to the queue; users dispatch callbacks
on their own thread via boxlite_runtime_drain. Callbacks structurally
cannot fire on a Tokio worker, eliminating the worker-park bug class
that surfaced with the original boxlite_execute API.

Highlights:
  - sdks/c/src/event_queue.rs (new): EventQueue (Mutex+Condvar over
    VecDeque, capacity 4096), RuntimeEvent enum, push_event with
    cooperative yield on full queue, plus all CBox*/CExecution*/
    CRuntime* callback typedefs as extern "C" fn aliases.
  - sdks/c/src/runtime.rs: RuntimeHandle gains Arc<EventQueue>; add
    boxlite_runtime_drain (pop + dispatch with the queue lock released
    before user code) and convert boxlite_runtime_shutdown to async + cb.
  - sdks/c/src/box_handle.rs, copy.rs, images.rs, info.rs, metrics.rs:
    every block_on becomes a Tokio spawn that pushes the typed
    RuntimeEvent. Pre-spawn validation errors still return inline.
  - sdks/c/src/exec/execution.rs: replace boxlite_execute with
    boxlite_box_exec (sync handle creation) plus _on_stdout/_on_stderr/
    _on_exit registration that lazily spawns per-stream pumps. _wait,
    _kill, _resize_tty become async + cb. _write_stdin remains sync.
  - sdks/c/include/boxlite.h: regenerated by cbindgen with all callback
    typedefs, async signatures, and the drain function.
  - sdks/c/tests/*.c: remove the C-level integration tests that
    exercised the removed sync API. Coverage of those code paths now
    lives in the Go SDK + runner integration suite. test_simple_api.c
    is preserved (uses the still-sync boxlite_simple_* surface).

* test(c-ffi): callbacks fire on drain thread only (phase 2 regression guard)

Adds three focused unit tests in sdks/c/src/event_queue.rs that guard the
post-and-drain redesign's cardinal architectural invariant — callbacks
NEVER fire on Tokio worker threads. These directly catch the bug class
that motivated the redesign (May 2026 outage):

  [A] callbacks_fire_on_drain_thread_only — pushes 100 Stdout events from
      a 4-worker Tokio runtime, drains on the test thread, asserts the
      single recorded callback thread id == drain thread id and is
      disjoint from the captured worker-id set.
  [B] drain_blocks_user_thread_only — single-worker Tokio runtime, 5
      sleep-100ms events drained on a separate std thread; asserts a
      Tokio canary task keeps ticking while drain is blocked, proving
      the worker is free.
  [C] pump_yields_when_queue_full — single-worker Tokio runtime,
      capacity=4 queue, 100 events posted concurrently with a canary;
      asserts the canary makes progress, proving the producer
      cooperatively yields rather than monopolizing the worker.

Visibility tweak: extracted push_event_with_capacity as pub(crate) from
push_event so [C] can exercise the cooperative-yield path with a small
queue without touching the production QUEUE_CAPACITY default.

Stability: each new test passed 5/5 back-to-back runs. Full lib suite is
24 passing (21 existing + 3 new); clippy clean; fmt clean.

* feat(go-sdk): drain goroutine + cgo callbacks for async C API (phase 2 go)

Adapt the Go SDK to the post-and-drain callback C API. Each Runtime starts
a single drain goroutine that calls boxlite_runtime_drain in a loop;
callbacks therefore fire on a Go-owned thread, so cgo callbacks can write
to pipe writers, channels, and atomics without hijacking a Tokio worker.

Per-method changes:
  - All async ops (Create/Get/Remove/Start/Stop/CopyInto/CopyOut/Metrics/
    ListInfo/GetInfo/Shutdown/ImagePull/ImageList/ExecutionWait/Kill/
    ResizeTTY) post a cgo.Handle pointing at a buffered result channel and
    block on select(channel, ctx.Done()).
  - StartExecution registers stdout/stderr/exit callbacks via the C
    streaming API; the runner-facing ExecutionOptions{Stdout,Stderr,
    OnStdout,OnStderr,TTY} shape is unchanged.
  - Stdin writes call boxlite_execution_write_stdin synchronously.

bridge.{c,h} expose typed cb<X>() accessor functions that hand back the
Go-exported callback symbols cast to the cbindgen-generated typedefs, so
every cgo file in the package can pass C.cbXxx() directly into the API.

The streaming cgo.Handle is intentionally leaked: stream events and the
exit event are pushed concurrently by independent Rust pumps with no
strict ordering guarantee, so deleting the handle from the exit callback
could race a sibling stream callback's value lookup. The leak is bounded
to one map entry plus a small struct per execution.

* fix(shim): emit @loader_path/$ORIGIN rpath from shim's own build.rs

The boxlite-shim split out of the boxlite crate in 3193f50 but the
rpath link-args were left behind in src/boxlite/build.rs, where they
no longer apply (boxlite is rlib-only post-refactor — cargo:rustc-link-arg
is scoped to the emitting crate's binaries). Without LC_RPATH, the
shim binary's libkrun cannot dlopen libkrunfw.<X>.dylib collected next
to it in the runtime directory.

Move the link-args to a new src/shim/build.rs and delete the dead
emission in boxlite's build.rs.

Verified: shim now has LC_RPATH @loader_path; make test:integration:c
passes (was failing with "Couldn't find or load libkrunfw.5.dylib").

* fix(shim): move --allow-multiple-definition link-arg from boxlite to shim

CI on Linux x86_64 fails with:
  invalid instruction `cargo:rustc-link-arg-bins` from build script of
  `boxlite v0.9.3`. The package boxlite v0.9.3 does not have a bin target.

Same root cause as the rpath fix in this PR: boxlite is rlib-only
post-#494, so cargo:rustc-link-arg-bins is rejected. Move the linker
flag to src/shim/build.rs (where the libkrun staticlib std-symbol
conflict is actually resolved at link time), keep cargo:rustc-link-arg-
tests on boxlite for its integration test binaries.

* chore(skills): add adversarial-iteration skill

Captures the implement -> review -> reflect -> reproducer-first -> fix
loop used during the round-1/round-2 Codex review iterations on this
branch. Encodes the four cheap pre-implementation checks (timeline draw,
upstream contract read, full lifecycle trace, variant enumeration) and
the recurring failure shapes seen in this codebase, so future iterations
can REFLECT on root cause instead of patching symptoms.

* fix(c-ffi): round-1 + round-2 review follow-ups

Bundles two adversarial-review rounds on the post-and-drain async
callback C/Go SDK (PR #495). Squash-merging because the diffs interleave
across event_queue.rs, execution.rs, runtime.rs, and the producer sites,
with round-2 build-depending on round-1 (e.g. dispatch_handle_event<T>
signature requires OwnedFfiPtr<T>).

Round-1 findings:
- NULL callback at FFI boundary invoked UB before any Rust code ran.
  Fix: typedefs are now Option<extern "C" fn(...)>; an unwrap_cb_or_return!
  macro at every public entrypoint rejects NULL with InvalidArgument.
- Go SDK leaked cgo.Handle (and live VMs, for Create) when caller's ctx
  cancelled before the C task completed. Fix: abandonAsync /
  abandonAsyncErr / drainAndDelete helpers detach cleanup onto a
  goroutine so callers return ctx.Err() promptly AND the handle is
  reclaimed when the result eventually arrives.

Round-2 findings:
- Execution::wait held the inner mutex across recv().await, blocking
  kill/signal/resize_tty when wait was parked. Fix: split lock into
  inner (interface ops) and wait_state (OnceCell + rx); wait now &self.
  Knock-on: cargo clippy --fix removed obsolete `mut` at 58 callsites
  across test-utils, node SDK, and boxlite integration tests that no
  longer needed mutable bindings.
- push_event_with_capacity dropped events on close, leaking
  Box::into_raw'd payloads for 6 lifecycle variants. Fix: OwnedFfiPtr<T>
  smart-pointer reclaims the heap on Drop, take() disarms on dispatch.
- Per-execution cgo.Handle leaked because stream/exit callbacks were
  not strictly ordered. Fix: stream pumps signal completion via oneshot;
  exit_pump awaits all stream-dones before pushing Exit; execution_free
  synthesises an EXIT_CODE_FORCE_CLOSED Exit on abort. Go now safely
  deletes the handle in the exit callback.

Tests:
- src/boxlite/src/litebox/exec.rs: wait_does_not_block_kill (round-2 #1)
- sdks/c/src/event_queue.rs: owned_ffi_ptr_* + closed-queue tests (round-2 #2)
- sdks/c/tests/test_null_callback.c: NULL callback rejection (round-1)
- sdks/go/abandon_async_test.go: round-1 abandonAsync helpers +
  TestExecutionStreamState_HandleDeletedOnExit (round-2 #3)

* fix(c-ffi): claim-once Exit dispatch (round-3 #1)

execution_free races exit_pump on Exit dispatch. execution_free
reads `completed = false`, then proceeds through block_on(kill+wait)
and a synth-Exit push; concurrently, exit_pump sets
`completed = true` and pushes its own Exit. Both Exits land in
the queue. The Go SDK's exit callback deletes the shared
cgo.Handle on the first Exit; the second tries to Value() a
freed handle and panics the drain goroutine.

Fix: split the conflated `completed: Mutex<bool>` into two
distinct AtomicBool fields, each with a single responsibility:

  - process_completed:  "process exited naturally" -- gates the
                        kill+wait branch in execution_free.
  - exit_dispatched:    "Exit was published to the queue" -- both
                        execution_free and exit_pump
                        compare_exchange(false, true, AcqRel,
                        Acquire); only the claimer pushes Exit.

The two-AtomicBool design replaces a Mutex-based check-then-act
guard that had a TOCTOU window arbitrarily wide
(block_on(kill+wait) sat inside it). compare_exchange is the
idiomatic claim-once primitive; conflating process state and
dispatch state is what created the original bug.

Reproducer (commit 09e73ac2) flips red -> green:
  test ::race_produces_at_most_one_exit_event ... ok

Helper-level invariant verified: two concurrent dispatches
through `dispatch_exit_test_helper` produce exactly 1 Exit
event regardless of interleaving.

Adjacent contracts to verify (out of scope for this commit;
covered by integration tests in subsequent loop iterations):
  - Force-close after natural completion: exit_pump claims,
    execution_free sees claim taken, skips push.
  - Force-close before natural completion: execution_free
    claims, exit_pump (still alive) sees claim taken, skips push.
  - Process-already-completed: execution_free skips kill+wait
    on process_completed=true; still attempts the dispatch
    claim (in case exit_pump hadn't pushed yet).

* fix(c-ffi): forward stream chunks byte-exact (round-3 #2)

stdout_pump and stderr_pump appended `b'\n'` to every chunk before
pushing the Stdout/Stderr event with a comment claiming to "restore
the line terminator that the upstream stream stripped". The
upstream does not strip terminators. Per
`boxlite::portal::interfaces::exec::route_output`:

    String::from_utf8_lossy(&chunk.data).to_string()

— raw byte chunks forwarded as-is, no line splitting, no newline
stripping. The variable name `line` was a fiction; chunks are
arbitrary byte boundaries.

Concrete corruption observed:
  - `printf hello` (no trailing newline) -> consumer saw `hello\n`.
  - Multi-chunk output -> extra `\n` injected at chunk boundaries.

Fix: delete `bytes.push(b'\n')` and the misleading comment in both
pumps. Rename local `line` -> `chunk` to match upstream terminology
and prevent future readers from making the same incorrect
assumption I did.

Reproducer (commit 04ca0459) flips red -> green:
  test ::stdout_pump_forwards_chunks_byte_exact ... ok
  test ::stderr_pump_forwards_chunks_byte_exact ... ok

The reproducer covers chunks with no trailing newline, multi-chunk
sequences, already-newlined chunks, and control-byte-bearing
chunks. Full C SDK lib suite: 37 passed, 0 failed.

REFLECT note: Upstream contract read (a 30-second `grep` on
`stdout_tx.send` in `src/boxlite/src/portal/`) would have caught
this. Recorded in the round-3 reflection.

* fix(go-sdk): broadcast Close to in-flight async callers (round-3 #3)

Runtime.Close stopped the drain goroutine BEFORE freeing the C
runtime handle. Any concurrent async caller (Create, Get, Shutdown,
Pull, ImageList, Info, Metrics, Start, Stop, Copy, Wait, Kill,
ResizeTTY, ListInfo, GetInfo) blocked on
`select { case res := <-ch: case <-ctx.Done(): }` would never
receive a result after stopDrain killed the only goroutine that
pumps events from C to Go. With a non-cancellable ctx, the caller
blocked forever and its abandonAsync cleanup goroutine (which
shares the same ch) also stranded — leaking cgo.Handles
unboundedly across long-lived processes.

Fix: a `closing chan struct{}` on Runtime, closed by Close BEFORE
stopDrain runs. Every async caller's select now has three cases:
ch, ctx.Done(), and closing. When Close fires `closing`, all
in-flight callers wake up and return ErrRuntimeClosed
(*Error{Code: ErrInvalidState}). The detached abandonAsync /
abandonAsyncErr / drainAndDelete goroutines also select on
closing so their cgo.Handle Delete runs immediately rather than
waiting forever for a Tokio task whose result event will be
dropped by the now-closed C queue.

Why closing-channel broadcast (not context.Context, not
sync.Cond): closed channels are the idiomatic Go signal-many
primitive; readers familiar with `<-ctx.Done()` recognise the
pattern instantly. Adds one field + one extra select case per
call site. No new types, no new dependency.

Files touched (15 select sites + 3 helper signatures + Runtime
struct + Close ordering):
  runtime.go: Runtime{closing,closingOnce}, NewRuntime init,
              Close closes closing first, Shutdown/Create/Get/
              removeBox selects, abandonAsync helpers refactored.
  errors.go:  ErrRuntimeClosed sentinel.
  exec.go:    Execution{closing}, StartExecution wires it,
              Wait/Kill/ResizeTTY selects.
  images.go:  Pull/List selects.
  info.go:    ListInfo/GetInfo selects.
  metrics.go: Runtime.Metrics/Box.Metrics selects.
  box_handle.go: Start/Stop selects.
  copy.go:    CopyInto/CopyOut selects.

Reproducer (commit 3642e2c1) flips red -> green:
  TestRuntimeCloseDoesNotStrandPendingPull ... ok
  Pull unblocked 9.389333ms after Close; err = boxlite: runtime closed (code=4)

Adjacent-contract tests (also passing):
  TestAbandonAsync_RespondsToCloseSignal
  TestAbandonAsyncErr_RespondsToCloseSignal
  TestDrainAndDelete_RespondsToCloseSignal
plus the existing round-1/round-2 helper tests retrofitted to
the new (ch, h, closing[, cleanup]) signatures.

REFLECT note: full lifecycle trace (the cheap check) drawn
during REFLECT showed the gap between stopDrain and free was
where in-flight callers got stranded; the fix puts a wakeup
mechanism in front of that gap.

* fix(c-ffi): type-aware OwnedFfiPtr destructor (round-3 #4)

OwnedFfiPtr<T>::drop reconstructed `Box::from_raw(ptr)` and let `T`'s
default Drop run. For Rust types with their own Drop (CBoxHandle is
the only such variant in our usage) this is sufficient. For C-ABI
repr(C) FFI structs (CImagePullResult, CImageInfoList, CBoxInfo,
CBoxInfoList), Drop runs no destructor for raw pointer fields, so
`CString::into_raw`'d allocations and `Vec::from_raw_parts` items
leaked when the queue closed mid-flight (which OwnedFfiPtr exists
to handle).

Fix: extend OwnedFfiPtr<T> with an optional type-aware destructor
hook. Producers who own a Rust-Drop-sufficient payload still call
`OwnedFfiPtr::new(boxed)` (Box::drop runs). Producers who own an
FFI struct with nested allocations call
`OwnedFfiPtr::new_with(boxed, free_xyz_ptr)`, where `free_xyz_ptr`
is the matching `boxlite_free_*` body (made `pub(crate)`). Drop
dispatches based on the stored function pointer.

Closure-parametric (over a single OwnedFfiPtr<T>) was chosen over
4 type-specific Owned* wrappers (Option A from REFLECT) because:
  - Single type, single API surface — minimum diff to existing code.
  - Producer site is explicit about which destructor applies via
    the constructor choice.
  - `unsafe fn(*mut T)` is a Copy zero-sized fn pointer — no
    runtime overhead beyond the existing AtomicPtr.

Producer sites updated to `new_with`:
  - images.rs: image_pull -> free_image_pull_result,
              image_list -> free_image_info_list.
  - info.rs:   get_info  -> free_box_info_ptr,
              box_list  -> free_box_info_list.
CBoxHandle producers (box_handle.rs) keep `new` — Rust struct,
Box::drop suffices.

Reproducer (commit b358542a) flips red -> green: all 4 leaky
variants reclaim the expected number of inner CStrings.

Adjacent contracts also tested:
  - take() must disarm the type-aware destructor (otherwise the C
    consumer would double-free) -- new test
    `owned_ffi_ptr_with_free_take_disarms_drop`.
  - CBoxHandle path unchanged (existing
    `owned_ffi_ptr_reclaims_allocation_on_drop` still passes).

Test isolation: a `pub(crate) static FREE_STR_LOCK: Mutex<()>`
serializes the leak-reproducer tests so parallel cargo runs don't
interleave their `FREE_STR_CALLS` snapshots.

Full C SDK lib: 42 passed, 0 failed, 1 ignored.

REFLECT note: variant enumeration (the cheap check) — listing all
6 payload types and writing down each one's destructor obligations
— would have caught this in 10 minutes. Recorded in the round-3
reflection.

* fix(c-ffi): wait for exit_pump dispatch instead of abort (round-4 B)

Codex round-4 finding: exit_pump's claim-then-push pattern (round-3 #1)
flips `exit_dispatched=true` BEFORE `push_event.await` returns. Under
queue backpressure the push yields cooperatively. During that yield
window, `execution_free` would:
  1. Read `exit_dispatched == true` → skip its synth Exit push.
  2. Call `pumps.abort()` → tear down exit_pump mid-yield.
Net: claim taken, no Exit pushed. Go SDK exit callback never fires,
cgo.Handle leaks, "exactly one Exit per execution" invariant violated.

Fix: separate exit_pump from the abortable stream-pump set. Track it
in a dedicated `exit_pump_handle: Mutex<Option<JoinHandle<()>>>`.
`execution_free` now branches:

  - We win the dispatch claim: push synth Exit, then abort exit_pump
    (which will see the claim taken and no-op when its push resumes).
  - exit_pump won the claim: wait (bounded 5s) for the task to finish
    so its push completes, instead of aborting it.

Stream pumps continue to be aborted as before.

Why wait-for-completion (option a from the test commit) over a
tri-state machine (option b): smaller diff, follows the "boring code"
principle — "wait for the task that took the lock" is more obvious
than "tri-state with timeout fallback". The 5s timeout bounds
worst-case teardown if the drainer is genuinely stuck.

Reproducer (commit a9b8bd08) updated to mirror the new flow:
  - capacity-1 queue + filler event.
  - spawn exit_pump-shaped task (claim, push, yield on backpressure).
  - on observation of the claim, spawn a drain task that pops the
    filler after 20ms.
  - wait (bounded) for the pump to finish, instead of aborting.
  - assert marker Exit reached the queue.

  test ::aborted_exit_pump_after_claim_must_still_dispatch_exit ... ok

Polarity: the test was reverse-engineered to fail under the old
abort-then-drain shape (committed at a9b8bd08); after this fix the
production wait-for-completion shape passes. Full C SDK lib: 43
passed, 0 failed, 1 ignored.

REFLECT note (round-4 unifying shape): "Helper-level test passed
the primitive's contract; system invariant under stress (saturated
queue + shutdown abort race) failed." The round-3 #1 helper test
verified two concurrent claims collapse to one push, but it never
saturated the queue or exercised abort-mid-push. Recorded for the
skill's Common Shapes list.

Round-4 finding A (cgo.Handle race during Close) — separate next
iteration.

* fix(go-sdk): single-path cgo.Handle ownership (round-4 finding A)

Codex round-4 finding A: Runtime.Close fires r.closing BEFORE
stopDrain returns. During the up-to-100ms drain blocking call between
close(r.closing) and stopDrain completing, the drain goroutine may
still dispatch a queued C event whose Go callback Value+Delete's the
same cgo.Handle that abandonAsync's closing branch (round-3 #3 fix)
is also Deleting. cgo.Handle.Delete panics on double-Delete; Value()
panics if Delete already ran. Either path winning the race produces
a panic, turning a clean shutdown into a process crash.

Fix: single-path ownership via a global registry of active handles.

  - `var activeHandles sync.Map` (key: uintptr(handle)).
  - `registerHandleForDispatch(cgo.NewHandle(...))` at every async-op
    call site (16 sites across runtime.go, exec.go, images.go,
    info.go, metrics.go, box_handle.go, copy.go).
  - `claimHandleForDispatch(h) bool` does atomic LoadAndDelete; first
    caller wins (returns true), subsequent callers no-op (return
    false). Exactly one caller proceeds to Value/Delete.
  - `deleteHandleForDispatch(h)` is the claim-then-Delete shorthand
    used by synchronous-failure paths and abandonAsync helpers.

Every Value/Delete site now claims first:

  - bridge_callback.go (12 dispatch callbacks): claim before
    `defer h.Delete()`. If claim fails, return without touching the
    handle — abandonAsync's closing branch already owns it.
  - abandonAsync / abandonAsyncErr / drainAndDelete: claim in BOTH
    the ch and closing branches. (This also fixes a latent
    double-Delete from round-1 where the dispatch's `defer h.Delete()`
    raced abandonAsync's `case res := <-ch:` h.Delete() — same
    primitive eliminates both.)
  - Synchronous-failure paths (`if code != C.Ok { ... }`): use
    `deleteHandleForDispatch` for symmetry; here there's no race but
    consistency keeps the pattern uniform.

Why a global sync.Map vs per-Runtime registry: dispatch callbacks
in bridge_callback.go only have `userData *unsafe.Pointer` — no
Runtime reference. Threading Runtime through every callback would
be invasive. The handle's uintptr is unique within the process so
a global map is the minimum-friction shape.

Why LoadAndDelete vs sync.Once-per-handle: LoadAndDelete is one
atomic op on an existing data structure. sync.Once-per-handle would
need a per-handle Once value (extra struct allocation per call).

Reproducer (commit 7749b68b) flips red -> green:
  test TestHandleOwnership_NoDoubleDeleteRaceDuringClose ... PASS
  (500/500 panics on the unfixed code; 0/500 after fix.)

Adjacent contracts (all passing):
  - TestAbandonAsync_RunsCleanupOnSuccess
  - TestAbandonAsync_SkipsCleanupOnError
  - TestAbandonAsyncErr_DeletesHandle
  - TestDrainAndDelete_DeletesHandle
  - TestAbandonAsync_RespondsToCloseSignal
  - TestAbandonAsyncErr_RespondsToCloseSignal
  - TestDrainAndDelete_RespondsToCloseSignal
  - TestExecutionStreamState_HandleDeletedOnExit (round-2 #3)
  - TestRuntimeCloseDoesNotStrandPendingPull (round-3 #3)

Streaming exec handle (`streamHandle` in exec.go, owned by
dispatchExit in bridge_callback.go) is intentionally NOT registered:
it has only one Delete path (the per-execution exit dispatch
chain — round-2 #3 ordering invariant), no race to coordinate.

REFLECT note: the unifying shape across rounds 3-4 is "two paths
both taking ownership of the same resource without coordination."
Round-3 #1 (Exit dispatch) addressed it with compare_exchange on
exit_dispatched at the C-FFI level. Round-4 finding A applied the
same shape to cgo.Handle ownership at the Go SDK level.

* fix(c-ffi+go-sdk): teardown ordering + payload reclamation (round-6)

Codex round-6 surfaced two siblings of prior teardown findings.
Both are addressed in this single commit because they share the
"forced-tear-down path didn't preserve invariants the natural-path
established" shape.

Finding 1 (sdks/c/src/exec/execution.rs:566-600): execution_free's
force-close path published the synthetic Exit BEFORE aborting
stdout/stderr pumps. Under backpressure or scheduling, a stream pump
could enqueue Stdout/Stderr AFTER the synth Exit. The Go exit
callback Deletes the per-execution cgo.Handle; a later stream
callback would Value() a deleted handle and panic. This violated
the round-3 #1 "Exit is strictly last" invariant on the force-close
path (the natural-completion path via exit_pump correctly awaits
each stream pump's done_tx before pushing Exit; force-close
skipped that step).

Fix: abort stream pumps FIRST in execution_free, then run the
exit-dispatch claim race + synth Exit push. Aborting drops a
pump's future before the next push_event re-check, so no
Stdout/Stderr lands after the Exit. exit_pump (handled separately
via the dedicated exit_pump_handle from round-4 #B) is unaffected.

Finding 2 (sdks/go/runtime.go:387-405 and bridge_callback.go):
when abandonAsync's closing branch claimed the cgo.Handle (round-4
#A primitive), a queued success callback that fired AFTER the
claim returned silently without freeing its received C-side
payload. Rust had already transferred ownership via
OwnedFfiPtr::take() before invoking the callback, so neither side
reclaimed the heap. For Create this leaks a live VM. For
Get/ImagePull/ImageList/Info/InfoList it leaks the OwnedFfiPtr<T>
allocation including its nested CString::into_raw'd inner pointers.

Fix: introduce `claimOrFreePayload[P](h, payload, free)` — the
dispatch-callback gate that claims, and on claim-failure invokes
`free(payload)` before returning. Each affected callback in
bridge_callback.go (5 sites: Create, Get, ImagePull, ImageList,
Info, InfoList) is updated to use it with the matching
`boxlite_free_*` C function. Error-only callbacks (Start/Stop/
Remove/Copy/Shutdown/Wait/Kill/Resize) keep using bare
claimHandleForDispatch — no payload to reclaim.

Reproducer (commit cc483053) flips red -> green:
  test TestClaimOrFreePayload_FreesPayloadWhenClaimAlreadyTaken ... PASS
  (got 0 invocations on the unfixed stub; got 1 after fix.)

Adjacent-contract tests (also passing):
  TestClaimOrFreePayload_DoesNotFreeWhenClaimSucceeds — claim-success
  path does NOT call free (caller owns the payload normally).
  Plus all round-3 + round-4 reproducer tests + helper tests.

Note on finding 1: I did not write a separate runtime test because
the bug is fundamentally timing-dependent (race between stream
pump's push_event yield and execution_free's abort). Code-review
verification is the regression guard; the round-3 #1 invariant
"Exit is strictly last" remains the explicit contract.

Round-6 unifying shape (recorded for future REFLECT): "Forced-
tear-down path didn't preserve invariants the natural-path
established." Round-3 #1 (Exit ordering) addressed the natural
path; round-4 #B added force-close ordering for exit_pump itself;
round-6 finding 1 extends that to stream pumps; round-6 finding 2
extends payload reclamation to claim-failure paths. The recurring
sibling pattern across rounds 3-6 suggests the close-path design
warrants a single architectural pass once the immediate fixes
land.

* fix(c-ffi): only mark process_completed on Ok wait (round-8 F1)

Codex round-8 F1: round-7's unconditional `process_completed=true`
treated a failed wait as proof the process exited. For
untrusted/sandboxed code, a wait_on_clone error (e.g. transport
loss to the guest agent) may mean the process is still running.
execution_free uses process_completed to skip kill+wait cleanup;
under round-7 a transient error caused the handle to be freed
without termination, leaving a live process running.

Fix: guard the store with `result.is_ok()`. Errored waits leave
process_completed false; execution_free correctly retries
kill+wait. Ok waits set the flag and skip the redundant cleanup
(the round-7 invariant — "natural exit shouldn't get
EXIT_CODE_FORCE_CLOSED layered on top" — is preserved).

Reproducer (commit 63b883a6) flips red -> green:
  test ::execution_wait_must_not_mark_process_completed_on_error ... ok

Plan-mode discipline (per skill, no user gate):
  Approaches considered:
    (A) Only on Ok. RECOMMENDED, picked. Smallest change; matches
        the security invariant "wait observed != process exited".
    (B) Split into wait_observed + process_exited. Bigger refactor;
        deferred to architectural pass for round-9 B (F2 + F3).

  Codex's round-8 F1 recommendation explicitly named (A); picking
  it directly.

Adjacent contracts (verified):
  - Empty-handle wait (Err) leaves process_completed=false.
  - Successful wait still sets process_completed=true (round-7
    invariant; covered by boxlite/src/litebox/exec.rs::tests where
    a stub backend can produce Ok results).
  - exit_pump path unchanged.
  - Full C SDK lib: 44 passed.

Round-8 F2 (Box ops schedule then check closing) and F3 (Create
cleanup skipped on close) remain — these are siblings of the
recurring teardown shape and are addressed in round-9 B as the
3-phase shutdown architectural change.

* chore(c-ffi+go-sdk): drop iteration-round narrative from source comments

The 8 rounds of Codex review on this branch landed with commit-message
narrative ("round-N", "Codex finding #M", "F1") leaked into source-file
comments. Per CLAUDE.md, comments must explain why, not how a fix
evolved — and references to specific commits rot as the codebase
changes. The history is preserved in git.

This commit cleans the production-source side:

- sdks/c/src/lib.rs            : drop "Codex round-3 finding #4" pointer
- sdks/c/src/event_queue.rs    : drop round-1/2/3/4 references in
                                 OwnedFfiPtr docs and test prologues
- sdks/c/src/exec/execution.rs : drop round-3/4/6/7/8 references in
                                 field docs, execution_free, and the
                                 dead "round-7 reproducer" doc block
                                 whose test was deleted by round-8
- sdks/go/runtime.go           : drop "(added by round-3 #3)" from
                                 activeHandles doc
- sdks/go/bridge_callback.go   : collapse 5x copy-pasted "(round-4
                                 finding A)" comment block to a single
                                 "see claimHandleForDispatch" line
                                 (the helper's doc already explains it)

No code changes — comments only. All boxlite-c (44) and boxlite (672)
tests pass; cargo clippy --tests clean for both packages; go vet and
go build clean for sdks/go.

* chore(tests): describe invariants instead of review rounds

Section headers and assertion messages in test files referenced
"Codex round-N finding #M" rather than the invariant under test.
Future debuggers reading a failure see the bug, not the review
iteration that found it.

- sdks/c/src/tests.rs               : "Codex finding #3" header → "NULL-callback rejection"
- sdks/go/abandon_async_test.go     : drop round-2/3 references in headers and bodies
- sdks/go/handle_payload_test.go    : drop round-4/6 references; describe the leak path
- sdks/go/handle_race_test.go       : drop round-3/4 references; describe the race
- sdks/go/runtime_close_test.go     : drop "(round-3 #3)" from a t.Fatalf message
                                      a future debugger would see in CI logs
- src/boxlite/src/litebox/exec.rs   : "Codex round-2 finding #1" header →
                                      "wait must not block kill"

* chore(skills): add COMPACT step to adversarial-iteration loop

Document a per-round compaction step (squash test+fix, audit round-N
narrative out of source comments, tree-identity check) so the next
adversarial review reads canonical code rather than a debugging
notebook. Also tighten the trigger/non-trigger guidance.
nieyy added a commit to nieyy/boxlite that referenced this pull request May 20, 2026
#1 (security, blocking): fix root-home asymmetry in boxlite-disable-ssh.
The script hard-coded HOME_DIR="/home/$UNIX_USER", so disable for root
targeted /home/root/.ssh/.ssh_enabled (non-existent) instead of
/root/.ssh/.ssh_enabled, leaving the marker on disk and allowing
boxlite-ensure-ssh to silently re-enable sshd on the next VM restart —
bypassing API-level revoke on the default code path.

boxlite-ai#2: update sshUserOrDefault() doc and test to reflect "root" default.
The comment claimed "boxlite" and TestSSHUserDefaultsToBoxlite asserted
the old value; both now match the implementation.

boxlite-ai#3: delete per-box mutex in cleanupSSHOnDestroy to prevent unbounded
growth of boxSSHMu across the runner process lifetime.

boxlite-ai#4: redact SSH bearer tokens in gateway log lines with redactToken()
(keeps 8-char prefix for correlation, replaces remainder with "…").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
G4614 referenced this pull request in G4614/boxlite May 22, 2026
… requirement

`BOXLITE_LIBKRUNFW_PRIVILEGED_PATH` was a build-time include guard that only
ever pointed at one possible path —
`<workspace>/target/privileged-kernel/lib64/libkrunfw-privileged.so.5`, the
exact output of `make libkrunfw-privileged`. The user-visible signal it
carried was a boolean ("is the privileged blob present?") that `build.rs`
can answer itself by calling `path.exists()`. As a result the env var
created an easy-to-miss step: a developer who ran `make libkrunfw-privileged`
but forgot the export would silently get a lean-only build, and the
mistake only surfaced much later as a confusing dind / iptables failure
inside the VM. PR boxlite-ai#568 review #2's fail-fast guard plastered the symptom;
this removes the source of the problem.

After this change:

- `libkrun-sys/build.rs::locate_privileged_blob` checks the env var first
  (preserved as an explicit override for CI caches / distro packaging
  paths outside the workspace), then falls back to the canonical path.
  A registered `cargo:rerun-if-changed=<canonical>` line means cargo
  rebuilds automatically when the blob appears, disappears, or changes.
- `make libkrunfw-privileged` followed by `make runtime:debug` is now
  the single-button workflow. No `export` line, no second step to forget.
- `make/test.mk::test:integration:cli` drops its `@export …` line — the
  same auto-detect handles tests.
- Test doc comment, `spawn.rs` runtime-error hint, build-script footer,
  and `make/build.mk` overview all updated to point at the make target
  instead of the env var.

Companion fix for a stale-artifact bug surfaced while debugging the rename:

- `libkrun-sys`'s `OUT_DIR` is persistent across cargo runs, so any
  privileged-variant blob staged by a prior build (including the
  pre-rename `libkrunfw-dind.so.5`) lingered in `lib64/` and got bundled
  into `boxlite`'s embedded runtime forever — verified live: the prior
  build's embedded-runtime warning sequence included both
  `Bundled library: libkrunfw-privileged.so.5` and the dead
  `Bundled library: libkrunfw-dind.so.5`. New
  `clean_stale_libkrunfw_variants` sweep removes any `libkrunfw-*.so*`
  before `LibFixup::fix` walks the directory, so the only privileged
  blob that survives is the one this run stages.
- `boxlite/build.rs::prune_stale_libkrunfw_variants` mirrors the
  cleanup at the bundle layer: when copying libkrunfw files into the
  embedded `runtime/`, it drops any `libkrunfw-<variant>.so*` already
  in the destination that the current source no longer produces. Lean
  `libkrunfw.so*` files (no dash) are never touched.

Verification (env var deliberately unset; canonical-path detection only):

  - cargo check + cargo build -p boxlite-cli --tests: clean
  - runtime:debug: auto-staged the canonical privileged blob without
    BOXLITE_LIBKRUNFW_PRIVILEGED_PATH set
  - Embedded runtime shrunk 8 files / 423.9 MB → 7 files / 403.5 MB
    (stale libkrunfw-dind.so.5 gone)
  - boxlite + boxlite-shim + boxlite-guest unit tests: 23/23 PASS
  - guest container::executor (Review #3 EPERM path): 3/3 PASS
  - dind_supports_docker_build (single nextest invoke): PASS 32.76s
  - dind_compose_multi_service_network (single nextest invoke):
    PASS 41.74s — DNS + ICMP + TCP service-to-service all green
  - dind stacked --test-threads=1: 2/2 PASS 67.44s
  - `find target -name "libkrunfw-dind*" -newer <baseline>`: empty
    — no new build session produced the stale variant

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lilongen referenced this pull request in lilongen/boxlite May 29, 2026
…ew fixes

Real-AWS B-tier re-run through /api/admin/runner-ops/*: masked apiKey (#1),
cooperative cancel → CANCELLED with no EC2 leak (#2), lock renewal across the
10-stage scale-down (#3), and the full add+add+place+scale-down-with-live-migration
happy path — all PASS. Test runners/boxes torn down; default/prod and dev DB untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant